Apple, the tech giant known for its innovative hardware products, has made a significant move in the artificial intelligence (AI) space. The company has introduced an open-source AI model called MLLM-Guided Image Editing (MGIE), designed to revolutionize image editing capabilities. Developed in collaboration with researchers from the University of California, Santa Barbara, the MGIE AI model was recently presented at the International Conference on Learning Representations (ICLR) 2024.
One of the key features of the MGIE AI model is its instruction-based image editing approach, which allows for more intricate image manipulations. Instead of relying on detailed prompts, users can give MGIE short natural language commands. For instance, a command like "make the color of the grass more green" translates to MGIE increasing the saturation of the grass by a certain percentage. The model leverages Multimodal Large Language Models (MLLMs) to turn these commands into visual representations of the intended edit and then manipulate the image down to the pixel level.
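To make the grass example concrete, here is a minimal Python sketch of what "increasing the saturation of the grass by a certain percentage" could look like at the pixel level. It is not MGIE's method: MGIE uses an MLLM and a learned editing model, whereas this sketch uses a simple hand-written hue rule. The function name `boost_green_saturation`, the 20% factor, and the green hue band are all assumptions chosen for illustration.

```python
import colorsys

def boost_green_saturation(pixels, factor=1.2, hue_range=(0.20, 0.45)):
    """Scale the saturation of pixels whose hue falls in a green band.

    pixels: list of (r, g, b) tuples with channels in 0-255.
    factor and hue_range are illustrative assumptions, not MGIE parameters.
    """
    out = []
    for r, g, b in pixels:
        # Convert to HSV so saturation can be changed independently of hue.
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        # Only touch pixels that are already greenish; leave the rest alone.
        if hue_range[0] <= h <= hue_range[1] and s > 0:
            s = min(1.0, s * factor)
        nr, ng, nb = colorsys.hsv_to_rgb(h, s, v)
        out.append((round(nr * 255), round(ng * 255), round(nb * 255)))
    return out

# A dull green "grass" pixel, a gray pixel, and a blue "sky" pixel.
sample = [(100, 160, 100), (128, 128, 128), (50, 80, 200)]
edited = boost_green_saturation(sample)
print(edited)
```

In this sketch only the greenish pixel changes: its red and blue channels drop while the green channel stays put, which is what a higher saturation means. The gray and blue pixels pass through untouched, a crude stand-in for the kind of local, region-specific edit the article describes.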
With the power of AI behind it, MGIE is capable of performing a wide range of image editing tasks. It can handle Photoshop-style modifications, optimize photos globally, and perform local editing. This means that users can rely on short, expressive commands to achieve edits such as cropping, resizing, object removal, and applying filters. Moreover, MGIE can adjust various aspects of an image, such as brightness, saturation, sharpness, color balance, and exposure, and can apply artistic effects such as sketching, watercoloring, and pop art.
What sets MGIE apart is its ability to not only edit the entire image but also manipulate specific parts of it. This includes removing or moving objects and adjusting their attributes such as texture, color, shape, style, and size. According to the research paper, experiments have shown that expressive instructions significantly improve the effectiveness of instruction-based image editing, leading to better results in automatic metrics and human evaluation.
This move signals Apple’s growing commitment to AI technology, not just in consumer products but across various domains. While Apple has long been known for its hardware advancements and its voice assistant Siri, the release of the MGIE AI model demonstrates its ambition to compete in the broader AI landscape.
In conclusion, Apple’s introduction of the open-source MGIE AI model brings advanced image editing capabilities to the forefront. With instruction-based editing and the power of MLLMs, users can manipulate images in a more expressive and efficient way. This development showcases Apple’s dedication to AI innovation and positions the company as a strong contender in the AI sphere.