Everything New in The World of AI!
Based on MattVidPro's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
OpenAI made DALL·E 3 available to ChatGPT Plus users inside the ChatGPT mobile app, enabling on-phone image generation.
Briefing
OpenAI has rolled out DALL·E 3 across ChatGPT Plus, and the biggest practical change is that DALL·E 3 is now available directly inside the ChatGPT mobile app. That means subscribers can generate images on their phones without switching tools—though some users complain the ChatGPT interface is more restrictive, blocking requests involving famous characters unless users bypass safeguards. For beginners, the convenience still makes ChatGPT a low-friction entry point, even if power users often prefer Microsoft Bing’s comparatively looser environment for image generation.
Midjourney is responding on two fronts: higher-resolution output and a push toward a broader web and mobile experience. A new upscaler with 2X and 4X options is live, targeting up to 16 megapixels—described as “photograph size.” Early results look detailed at a distance and natural overall, but close-up inspection shows the upscaling can behave more like a detail enhancer than a perfect recreation of fine textures, and background elements sometimes remain inconsistent. Midjourney is also testing a redesigned website in beta that loads faster and improves browsing, but generation on the site is not yet available in that beta. Access requires at least 1,000 generations on an account, suggesting Midjourney is gating features while it stabilizes the new platform.
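As a rough check on the "up to 16 megapixels" figure, the arithmetic can be sketched in a few lines of Python. The 1024 x 1024 base resolution is an assumption for illustration, since the source does not state the starting size, and `upscaled_megapixels` is a hypothetical helper, not part of any Midjourney tooling.

```python
def upscaled_megapixels(width: int, height: int, factor: int) -> float:
    """Return the pixel count, in megapixels, after scaling both sides by `factor`."""
    return (width * factor) * (height * factor) / 1_000_000


# Assumed base resolution for a square image; not stated in the source.
BASE_WIDTH, BASE_HEIGHT = 1024, 1024

for factor in (2, 4):
    mp = upscaled_megapixels(BASE_WIDTH, BASE_HEIGHT, factor)
    print(f"{factor}X: {BASE_WIDTH * factor} x {BASE_HEIGHT * factor} = {mp:.1f} MP")
```

Under that assumption, the 4X option lands just above 16 megapixels and the 2X option at roughly a quarter of that, because doubling each side quadruples the pixel count.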
The most notable Midjourney-related move may be the launch of a mobile app under the name “Niji Journey,” built with Spellbrush and positioned as an AI anime generator. The app supports prompt-based generation, image-to-image, and community-style live feeds where users can watch others’ generations in real time. Midjourney images can also be generated inside the app, and the social, feed-driven design hints at a strategy to reduce reliance on Discord, where Midjourney’s workflow has historically lived.
Beyond image generation, the roundup highlights open-source progress in “audio understanding.” A model called SALMONN (Speech Audio Language Music Open Neural Network) can analyze audio to produce text descriptions and interpretations, including recognizing background sounds such as gunfire and explosions. The same system can also respond to uploaded music, turning a piano-and-vocal track into an interpretive description of mood and structure. The practical implication is accessibility: audio-to-text systems could help people who are deaf by describing what’s happening around them without needing other people as intermediaries.
A separate AI art development, “PixArt Alpha,” is pitched as a fast-training text-to-image diffusion transformer that can train in about 10% of the time required by Stable Diffusion-family approaches, with claims of competitive image quality against models like Google’s Imagen and Midjourney. The efficiency angle matters because it could make home experimentation more feasible, especially since the project plans to release both code and weights.
On the autonomous-agent front, HyperWrite’s assistant (run by CEO Matt Shumer) received an update that improves task completion and adds source citation. A demo shows the assistant searching for restaurants, then making an OpenTable reservation by navigating the booking flow and filling in details automatically. Finally, Meta shared research toward near-real-time decoding of visual perception from brain activity recorded with magnetoencephalography (MEG), producing rough reconstructions of what participants viewed after only one second. The reconstructions are accurate enough to identify broad categories and colors, while still missing fine details like faces.
Cornell Notes
OpenAI’s DALL·E 3 is now available to ChatGPT Plus users inside the ChatGPT mobile app, making image generation more convenient, though some requests are blocked more often than in other interfaces. Midjourney is upgrading output with a new 2X/4X upscaler aimed at up to 16 megapixels, testing a faster website redesign, and launching a mobile app (Niji Journey) that emphasizes community-style live generation. Open-source SALMONN pushes audio understanding by converting audio (including background events and music) into text descriptions, with potential accessibility benefits. PixArt Alpha claims much faster training for text-to-image diffusion models and plans to release code and weights, lowering barriers to experimentation. HyperWrite’s autonomous assistant update demonstrates end-to-end task execution like booking an OpenTable reservation, while Meta’s MEG research moves toward decoding visual perception from brain activity.
What changed for DALL·E 3 users, and why does it matter in practice?
How does Midjourney’s new upscaler change the quality and workflow of generated images?
Why is Midjourney’s move toward a mobile app (Niji Journey) strategically important?
What can SALMONN do with audio, and what real-world use cases does that enable?
What’s the significance of PixArt Alpha’s claimed training speed?
How does HyperWrite’s autonomous assistant update demonstrate real autonomy?
What does Meta’s MEG research claim to achieve, and what are its limits?
Review Questions
- Which interface—ChatGPT mobile or Microsoft Bing—was described as having more restrictions for DALL·E 3 requests, and what kinds of requests were mentioned?
- What resolution target and upscaling options did Midjourney introduce, and what quality tradeoffs were observed when zooming in?
- How do SALMONN and Meta’s MEG-based system differ in input type and output type, and what potential applications were highlighted for each?
Key Points
1. OpenAI made DALL·E 3 available to ChatGPT Plus users inside the ChatGPT mobile app, enabling on-phone image generation.
2. ChatGPT’s DALL·E 3 workflow may block requests involving famous characters more often than Microsoft Bing, pushing some users to alternative interfaces.
3. Midjourney added 2X/4X upscaling aimed at up to 16 megapixels, improving usability at higher resolution while sometimes struggling with background detail.
4. Midjourney’s redesigned website is in beta with faster loading, but image generation on the site is not yet enabled; access is gated by the number of generations on an account.
5. Midjourney’s Niji Journey mobile app emphasizes community live feeds and supports prompt-based and image-to-image generation, potentially reducing reliance on Discord.
6. SALMONN is an open-source audio-to-text model that can interpret background events and music, with accessibility implications such as describing environments for people who are deaf.
7. HyperWrite’s autonomous assistant update demonstrates end-to-end task completion, including making an OpenTable reservation and sending a researched marketing email with citations.