AI Weekly Roundup: Gen 3, Voice Tech, and Copyright Debates

By scribe 6 minute read

Gen 3 Video Generation Goes Public

This week marked a milestone in AI-generated video: Runway opened Gen 3 to the public. Access is currently limited to Runway's pro users, but the release is a major step forward in text-to-video generation.

What Gen 3 Offers

Gen 3 generates video from a text prompt. For example, a prompt like "a bald eagle flying in front of an American flag with fireworks in the background" produces a video that attempts to match that description. The results are not always perfect, but Gen 3 is currently the most advanced text-to-video generator available to the public.

Comparison to Image-to-Video Tools

It's worth noting that for certain applications, image-to-video tools like Luma AI may still produce more impressive results. Users can generate an image using a text prompt, then upload that image to Luma AI to create a video. This two-step process can yield higher quality outputs for some scenarios.

Advancements in Voice Technology

ElevenLabs Updates

ElevenLabs, a leader in voice AI, introduced two updates this week:

  1. Famous Voices in Reader App: The company added iconic voices like Judy Garland, James Dean, Burt Reynolds, and Sir Laurence Olivier to its Reader app. Importantly, these additions were made with permission from, and compensation agreements with, the respective estates.

  2. Voice Isolator Feature: ElevenLabs unveiled a new tool that cleans up audio by removing background noise, leaving a clear voice recording.

Suno Music Generation App

Suno, known for its AI-powered music generation, released a mobile app this week. Currently available only on iOS, the app offers similar functionality to its web counterpart, making it easier for users to create AI-generated music on the go.

3D Image Generation Breakthroughs

Meta announced new research in text-to-3D generation called Meta 3D Gen. It takes a text prompt and returns a fairly high-quality 3D asset. While still in the research phase, this development could accelerate game development and 3D asset creation in other industries.

Open-Source AI Developments

Kyutai's Voice Model

Kyutai, an open-source AI research lab, released a new voice model called Moshi that aims to compete with GPT-4o's advanced voice capabilities. What sets it apart is its open-source nature, which allows other companies to build upon and improve the technology. While the current version may sound robotic, its real-time response capabilities are impressive.

InternLM 2.5: Large Language Model with Massive Context Window

A new open-source large language model, InternLM 2.5, was made available on Hugging Face. Its standout feature is a 1 million token context window, rivaling proprietary models like Google's Gemini (which has a 2 million token window). This development opens up new possibilities for developers and researchers working with extensive context requirements.

Browser and Search Engine AI Integration

Brave Browser Custom AI Models

The Brave browser updated its Leo AI feature to let users bring their own AI models. This option expands the browser's AI capabilities beyond the pre-installed models such as Mixtral, Claude, and Llama.

Perplexity Pro Search Enhancements

Perplexity's Pro Search received an update introducing multi-step reasoning capabilities. This improvement allows the AI to better understand complex queries, work through goals step-by-step, and synthesize in-depth answers more efficiently. The update also improved math and programming capabilities through integration with Wolfram Alpha.

Industry Moves and Partnerships

Apple Gets Observer Role on OpenAI's Board

Apple secured an observer role on OpenAI's board. The position carries no voting rights, but it is a notable move given that Apple and Microsoft (another OpenAI partner) are major competitors in the tech industry.

Potential Apple-Google Gemini Partnership

Rumors suggest that Apple might also partner with Google to offer Gemini alongside its existing collaboration with OpenAI. Such a move could give users more AI options within Apple's ecosystem.

New Lawsuit Against OpenAI and Microsoft

The Center for Investigative Reporting filed a lawsuit against OpenAI and Microsoft, alleging copyright infringement. This action joins a growing list of legal challenges faced by AI companies regarding the use of copyrighted material in training datasets.

Controversial Statements on Web Content Usage

Mustafa Suleyman, CEO of Microsoft AI, made statements suggesting that content openly available on the web since the 1990s should be considered fair use for AI training. This perspective has sparked debate about the boundaries of copyright in the digital age and the responsibilities of AI companies in using publicly accessible data.

Content Protection and AI Training

Cloudflare's Anti-Scraping Solution

Cloudflare introduced a new feature allowing website owners to prevent AI scrapers from accessing their content. This tool is available to both free and paid Cloudflare users, offering a way for content creators to protect their work from unauthorized use in AI training.
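
To give a rough sense of what blocking AI scrapers can involve, here is a minimal Python sketch of the simplest approach: rejecting requests whose User-Agent header contains a known crawler name. This is not how Cloudflare's feature is implemented, which the article does not describe. The token list, the function names, and the 403 response are all assumptions made for illustration, and real crawler names change over time.

```python
# Illustrative only: a naive User-Agent filter, not Cloudflare's actual method.
# The tokens below are examples of crawler names that have been publicly
# documented; treat the list as an assumption and keep it up to date yourself.
AI_CRAWLER_TOKENS = ("GPTBot", "CCBot", "ClaudeBot", "Bytespider")


def is_ai_crawler(user_agent: str) -> bool:
    """Return True if the User-Agent string contains a listed crawler token."""
    ua = user_agent.lower()
    return any(token.lower() in ua for token in AI_CRAWLER_TOKENS)


def response_status(user_agent: str) -> int:
    """Hypothetical helper: 403 for listed crawlers, 200 for everyone else."""
    return 403 if is_ai_crawler(user_agent) else 200


if __name__ == "__main__":
    crawler = "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"
    browser = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/127.0"
    print(response_status(crawler))
    print(response_status(browser))
```

A filter like this only stops crawlers that identify themselves honestly. Scrapers that spoof a browser User-Agent would pass straight through, which is one reason a network-level service is attractive to site owners.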

Figma's AI Training Announcement

Figma announced its intention to use user-generated content within its platform to train AI models for better understanding design concepts and patterns. While they plan to offer an opt-out option, this move has raised questions about data usage and user privacy in design tools.

Social Media and AI

YouTube's AI Content Removal Policy

YouTube implemented a new policy allowing content creators to request the removal of AI-generated content that simulates their likeness or voice. This move aims to protect creators from unauthorized AI-generated impersonations.

Instagram's AI Labeling Update

Instagram refined its AI content labeling system. Instead of broadly labeling images as "made with AI," the platform now uses the term "AI info" and provides more detailed metadata about AI involvement in image creation or editing.

Upcoming AI Releases and Features

Grok 2 Announcement

Elon Musk announced that Grok 2, the next version of xAI's model, is set to release in August. He emphasized improvements in purging other models' output from its training data, a response to concerns about AI models training on each other's outputs.

WhatsApp's AI Image Generation

Leaked screenshots suggest that WhatsApp is developing an AI image generation feature similar to Apple's recently announced capabilities. This feature would allow users to create cartoon or stylized versions of their profile pictures.

Emerging Technologies

Open-TeleVision Robot Control

The "Open-TeleVision" project showcased a notable development in robotics and virtual reality. It lets users control robots remotely using VR headsets like the Apple Vision Pro, enabling precise manipulation of robotic systems from thousands of miles away.

Conclusion

Despite being a holiday week in the United States, the AI industry continued to see significant developments across various sectors. From advancements in video and voice generation to ongoing debates about copyright and data usage, the field remains dynamic and fast-paced.

As AI technologies become more integrated into our daily lives and work processes, it's crucial to stay informed about these developments. The balance between innovation and ethical considerations continues to be a central theme in the industry, with companies and researchers working to push the boundaries of what's possible while addressing concerns about privacy, copyright, and responsible AI use.

For those interested in staying up-to-date with the latest in AI and emerging technologies, resources like the Future Tools newsletter offer regular insights into new tools and industry trends. As we move forward, the intersection of AI with other technologies like robotics and virtual reality promises to open up new frontiers in how we interact with and leverage artificial intelligence in both personal and professional contexts.

Article created from: https://youtu.be/625DnCyhI20?feature=shared