Alert button
Picture for Alexander Martin

Alexander Martin

Alert button

Jurassic World Remake: Bringing Ancient Fossils Back to Life via Zero-Shot Long Image-to-Image Translation

Aug 14, 2023
Alexander Martin, Haitian Zheng, Jie An, Jiebo Luo

Figure 1 for Jurassic World Remake: Bringing Ancient Fossils Back to Life via Zero-Shot Long Image-to-Image Translation
Figure 2 for Jurassic World Remake: Bringing Ancient Fossils Back to Life via Zero-Shot Long Image-to-Image Translation
Figure 3 for Jurassic World Remake: Bringing Ancient Fossils Back to Life via Zero-Shot Long Image-to-Image Translation
Figure 4 for Jurassic World Remake: Bringing Ancient Fossils Back to Life via Zero-Shot Long Image-to-Image Translation

With a strong understanding of the target domain from natural language, we produce promising results in translating across large domain gaps and bringing skeletons back to life. In this work, we use text-guided latent diffusion models for zero-shot image-to-image translation (I2I) across large domain gaps (longI2I), where large amounts of new visual features and new geometry need to be generated to enter the target domain. Being able to perform translations across large domain gaps has a wide variety of real-world applications in criminology, astrology, environmental conservation, and paleontology. In this work, we introduce a new task Skull2Animal for translating between skulls and living animals. On this task, we find that unguided Generative Adversarial Networks (GANs) are not capable of translating across large domain gaps. Instead of these traditional I2I methods, we explore the use of guided diffusion and image editing models and provide a new benchmark model, Revive-2I, capable of performing zero-shot I2I via text-prompting latent diffusion models. We find that guidance is necessary for longI2I because, to bridge the large domain gap, prior knowledge about the target domain is needed. In addition, we find that prompting provides the best and most scalable information about the target domain as classifier-guided diffusion models require retraining for specific use cases and lack stronger constraints on the target domain because of the wide variety of images they are trained on.

* 9 pages, 10 figures, ACM Multimedia 2023 
Viaarxiv icon

MegaWika: Millions of reports and their sources across 50 diverse languages

Jul 13, 2023
Samuel Barham, Orion Weller, Michelle Yuan, Kenton Murray, Mahsa Yarmohammadi, Zhengping Jiang, Siddharth Vashishtha, Alexander Martin, Anqi Liu, Aaron Steven White, Jordan Boyd-Graber, Benjamin Van Durme

Figure 1 for MegaWika: Millions of reports and their sources across 50 diverse languages
Figure 2 for MegaWika: Millions of reports and their sources across 50 diverse languages
Figure 3 for MegaWika: Millions of reports and their sources across 50 diverse languages
Figure 4 for MegaWika: Millions of reports and their sources across 50 diverse languages

To foster the development of new models for collaborative AI-assisted report generation, we introduce MegaWika, consisting of 13 million Wikipedia articles in 50 diverse languages, along with their 71 million referenced source materials. We process this dataset for a myriad of applications, going beyond the initial Wikipedia citation extraction and web scraping of content, including translating non-English articles for cross-lingual applications and providing FrameNet parses for automated semantic analysis. MegaWika is the largest resource for sentence-level report generation and the only report generation dataset that is multilingual. We manually analyze the quality of this resource through a semantically stratified sample. Finally, we provide baseline results and trained models for crucial steps in automated report generation: cross-lingual question answering and citation retrieval.

* Submitted to ACL, 2023 
Viaarxiv icon

An Online-Learning Approach to Inverse Optimization

Oct 30, 2018
Andreas Bärmann, Alexander Martin, Sebastian Pokutta, Oskar Schneider

Figure 1 for An Online-Learning Approach to Inverse Optimization
Figure 2 for An Online-Learning Approach to Inverse Optimization
Figure 3 for An Online-Learning Approach to Inverse Optimization
Figure 4 for An Online-Learning Approach to Inverse Optimization

In this paper, we demonstrate how to learn the objective function of a decision-maker while only observing the problem input data and the decision-maker's corresponding decisions over multiple rounds. Our approach is based on online learning and works for linear objectives over arbitrary feasible sets for which we have a linear optimization oracle. As such, it generalizes previous approaches based on KKT-system decomposition and dualization. The two exact algorithms we present -- based on multiplicative weights updates and online gradient descent respectively -- converge at a rate of O(1/sqrt(T)) and thus allow taking decisions which are essentially as good as those of the observed decision-maker already after relatively few observations. We also discuss several useful generalizations, such as the approximate learning of non-linear objective functions and the case of suboptimal observations. Finally, we show the effectiveness and possible applications of our methods in a broad computational study.

Viaarxiv icon