Meta introduces Llama 3: the most advanced open-source model
Mark Zuckerberg raises the bar among open-source generative models.
Source: Meta
Good morning!
In this edition, we highlight Meta's launch of Llama 3, a significant advance over the previous model that raises the bar for open-source language models. We'll explore this and other developments that are shaping the future of technology and society.
In today's Mail:
Meta introduces Llama 3, its most advanced model
Instagram advances AI for digital influencers
Boston Dynamics unveils the new Atlas robot
xAI launches Grok-1.5V, its multimodal model
AI surpasses humans in air combat simulations
Adobe introduces AI video editing in Premiere Pro
Microsoft Research presents a model for realistic talking faces
Study shows AI performance comparable to ophthalmologists
And more: a band discusses the role of AI in music; Autodesk explores the impact of AI on creative professions; a new wearable device for automatic recording and transcription; criticism of A24 for using AI to market the movie "Civil War"; JPMorgan Chase's vision of the future of AI in banking; and other news.
Nanotutorial: How to generate "3D renderings" from simple sketches.
Reading time: 10 minutes
NEWS OF THE WEEK
META
Meta has announced the launch of Llama 3, an open-source language model designed to surpass current standards in artificial intelligence. The new model, available in pretrained and instruction-tuned versions with 8B and 70B parameters, promises significant improvements in tasks such as reasoning and code generation. Llama 3 is also part of an effort to foster responsible AI innovation, which includes safety tools such as Llama Guard 2 and Code Shield.
In addition to its advanced technical capabilities, Llama 3 will be made available on popular platforms such as AWS, Google Cloud, and Microsoft Azure, promoting accessibility and collaboration within the developer community. Meta aims to continue its leadership in open-source language models, reinforcing its commitment to openness and responsibility in AI development.
BOSTON DYNAMICS
Boston Dynamics has announced a significant evolution for its humanoid robot, Atlas, marking the transition from a hydraulic system to a fully electric model. This change comes after decades of development and is an important step for the company, especially following its acquisition by Hyundai. The new Atlas is designed to be more efficient and adaptable, ideal for integration into existing industrial environments.
The new model features several significant improvements over its previous version. Among these is the ability to initiate movements from a prone position, which allows it to stand up by itself after falling, enhancing its practical functionality and autonomy in work environments. Additionally, Atlas has an advanced control system that grants it more precise and agile movements, akin to those of an elite athlete.
xAI
xAI has announced the launch of Grok-1.5V, a multimodal artificial intelligence model with enhanced performance across a variety of domains, from multidisciplinary reasoning to understanding documents, scientific diagrams, graphs, screenshots, and photographs. This model stands out for its superior ability to comprehend and interact with the physical world, surpassing competing models in various evaluation metrics.
Grok-1.5V is designed to understand both text and images, making it a valuable tool for tasks requiring complex integration of different data types. This capability enables applications such as enhanced analysis of complex documents and the creation of informative visual content from textual descriptions. Moreover, xAI emphasizes its commitment to the democratization of AI, offering this advanced model as an accessible and open tool for researchers and developers.
DARPA
The recent development of an artificial intelligence program that allowed autonomous control of a combat aircraft in simulations against human pilots marks a significant advancement in military aviation. This project was part of the Air Combat Evolution (ACE) program, a collaboration between DARPA and other military institutions, such as the Air Force Test Pilot School and the Air Force Research Laboratory.
During the final event of a three-day competition, the experimental X-62 VISTA aircraft, controlled by AI, demonstrated superior accuracy and reaction speed against human pilots in simulated scenarios. The competition included AI programs developed by several companies, each of which learned through thousands of simulations. The winning AI exhibited aggressive behavior and exploited the safety limitations imposed on the human pilot to gain significant advantages during combat.
META
Instagram is testing a new program that allows digital influencers to use artificial intelligence chatbots to interact with their followers. This program, still in the early testing phase and known as "Creator AI," aims to enable influencers to communicate directly with fans via direct messages and, potentially, comments on the platform in the future. This initiative reflects a broader effort by Meta, Instagram's parent company, to integrate AI technologies into its products and services. According to reports, the chatbot would emulate the "voice" of the influencers to respond to fans, and messages would be automatically sent, with an initial indication that they were generated by AI.
The program aims to allow creators with large audiences to connect more efficiently with their followers, reducing the workload involved in personally responding to a large volume of messages and comments. Meta is aggressively investing in the incorporation of artificial intelligence across all aspects of its business, from improving its advertising systems to implementing AI assistants in smart glasses and other hardware projects. Mark Zuckerberg, CEO of Meta, has highlighted AI as a "growing engine of opportunity" for individuals, businesses, and the economy in general.
ADOBE
Adobe Premiere Pro is set to revolutionize video editing workflows with the integration of generative artificial intelligence. This new approach aims to enhance the creativity and efficiency of professionals by allowing the addition and removal of objects in scenes, as well as the generation of new frames and b-roll from text prompts. These tools, part of the Firefly video model from Adobe, along with third-party models like Sora from OpenAI and RunwayML, promise to transform video editing, making it more flexible and powerful.
Adobe is introducing features such as "Generative Extend," which extends a clip by adding generated frames at the beginning or end, and "Addition and Removal of Objects" tools, which make it easier to manipulate elements within a video. Additionally, "Generative B-Roll" will create supplementary scenes relevant to the narrative, generated automatically from textual descriptions. Adobe plans to launch these features by the end of 2024.
MICROSOFT
Microsoft Research has presented VASA-1, a new technology capable of generating realistic talking faces from a single still photo and an audio file. Notable for its precise synchronization between lip movements and audio, as well as its natural facial and head dynamics, the technology represents a qualitative leap in the creation of virtual characters. VASA-1 can generate talking-face videos in real time, with potential applications in gaming, virtual reality, and interactive media.
The model, besides impressing with image quality and processing speed, also captures facial expressions and head movement nuances that contribute to a more authentic and immersive visual experience. This technology can be especially useful in sectors such as education, where realistic avatars can serve as virtual assistants, and in entertainment, where it can enrich interactivity and user experience.
RESEARCH
A recent study published in "PLOS Digital Health" revealed that the latest generation of language models is nearly reaching the level of experts in clinical knowledge in ophthalmology. The study compared the performance of various language models, including GPT-3.5, GPT-4, PaLM 2, and LLaMA, with that of ophthalmologists and doctors in training through a simulated exam. GPT-4 showed performance superior to GPT-3.5 and comparable to specialized ophthalmologists, with an average accuracy rate of 69% compared to the experts' 76%. This advancement indicates the potential of language models to provide accurate medical advice where access to specialists is limited.
The results suggest that while language models still face challenges in terms of specific knowledge and practical clinical application, they are already equipped to significantly assist in the field of ophthalmology. The study also highlights the need for robust clinical benchmarks to better assess the capabilities of language models in healthcare settings before clinical trials can be designed and conducted. The research was funded by renowned entities, including the National Medical Research Council of Singapore and the Duke-NUS Medical School, highlighting substantial support for advancements at the intersection of artificial intelligence and healthcare.
SHORTS
🎵 M. Shadows, lead vocalist of the band Avenged Sevenfold, stated that music fans will not mind whether music is created by AI or by humans. He considers AI a valuable tool to expand creativity and personalize music, allowing new forms of artistic expression. Learn more.
🎨 A recent study by Autodesk, discussed on Creative Bloq, addresses the impact of artificial intelligence on creative professions, suggesting that AI can both enhance productivity and challenge the originality and authenticity of creative works. The increasing integration of AI tools is changing traditional creative processes, requiring professionals to adapt and innovate. Learn more.
🏌️‍♂️ New AI functionalities have been integrated into The Masters tournament app to enhance the experience for golf fans. The "Hole Insights" feature analyzes historical data to predict outcomes based on ball positions, while AI-generated narrations in English and, eventually, Spanish, enrich the visual experience with detailed commentary. Learn more.
🎤 Limitless has announced the AI Pendant, a wearable device that promises to transform meetings and daily interactions by automatically recording and transcribing conversations. With a focus on privacy, the Pendant uses encryption to protect recorded data and includes a "consent mode" for ethical recording. Offered at $99, with additional functionalities available via subscription, this pendant promises to be a useful tool for daily life. Learn more.
🖼️ Stability AI has launched the API for Stable Diffusion 3, a significant evolution in text-based image generation systems. This advanced model promises greater adherence to prompts and improvements in typography, surpassing previous versions and other market systems. Available on the Stability AI development platform, the model will also be accessible for self-hosting soon, aligned with the company's commitment to open generative AI. Learn more.
🌐 The French startup Mistral AI is in negotiations to raise several hundred million dollars, aiming for a valuation of $5 billion. Following a recent funding round of $415 million, Mistral, which develops open-source artificial intelligence, continues to attract strong investor interest. Learn more.
📱 Snapchat has enhanced its artificial intelligence policies, focusing on greater transparency and safety. The company announced that it will introduce watermarks on AI-generated images on the platform. Learn more.
🎥 A24 faces criticism after releasing AI-generated posters for the movie "Civil War," showing apocalyptic scenes of major US cities that do not appear in the film. This use of AI-generated content has raised questions about authenticity and the replacement of humans by machines. Learn more.
🏦 Jamie Dimon, CEO of JPMorgan Chase, outlined his vision for the future of money in a banking world dominated by AI, comparing its potential impact to historical innovations such as the printing press and the steam engine. He predicts that AI will revolutionize banking, improving services and being integrated into almost all banking processes as a co-pilot. However, Dimon also warns about the risks of AI, including the difficulty in understanding decisions and the potential for increased market volatility. Learn more.
VIDEO OF THE WEEK
Boston Dynamics surprised the industry last week with a demonstration video of the new version of its Atlas robot.
TOOLS
🧠 Wonders: an artificial intelligence-driven research tool that allows anyone to learn directly from reliable scientific sources. With access to over 270 million scientific documents, Wonders provides quick and easy-to-understand insights into complex questions, covering a wide range of topics. Link.
🎨 Spline AI 3D Generation: an advanced design platform that transforms textual descriptions and 2D images into detailed 3D objects. Users can create, edit, and generate textures for 3D objects using simple text commands, making it possible to conceive and realize ideas in real time. With an intuitive interface and no need for prior 3D modeling experience, Spline AI makes 3D creation accessible to everyone. Link.
🔊 OptimizerAI: an artificial intelligence tool that transforms textual descriptions into custom sound effects for games and video projects. With an intuitive interface, OptimizerAI enables game designers and content creators to generate high-quality sound effects quickly and efficiently, without the need for advanced technical knowledge in audio production. Link.
🌐 TripoSR: a tool that generates 3D models from single images, developed jointly by Stability AI and Tripo AI. This artificial intelligence model can create detailed 3D reconstructions quickly and efficiently, and is particularly suited to fields such as entertainment, industrial design, and architecture. Link.
NANOTUTORIAL
🎨 How to "Render" a Simple Sketch with AI
by Javi Lopez
Yes! You can transform an image with just scribbles into a complete 3D rendering simulation with up to 10K resolution using Magnific. Step by step:
Your original image can be as simple as this image:
Then, you just need to perform a Style Transfer + Enlargement on Magnific et voilà! The resulting image will inherit the style of your reference image and you can guide the generation details (material description, textures) using the prompt.
And yes, you know you can experiment with the reference image, testing various prompts and parameters!