I tried 8 of our newest AI products and updates

Jun 05, 2024

Here's one Googler's experience trying out our newest AI products and updates at Google I/O 2024.

Chaim Gartenberg

Keyword Contributor

General summary

At Google I/O, attendees got hands-on experience with the latest AI products and updates. Gemini 1.5 Pro, a large-scale foundation model, can now summarize and analyze documents up to 1,500 pages long and is integrated into Workspace apps like Gmail and Docs. Imagen 3, a text-to-image model, can generate decorative text and letters. Gemini's overlay on Android phones provides context-aware suggestions and answers questions about content on the screen. Project Astra, a multimodal conversational AI, understands voice prompts and live video feeds, enabling new experiences like playing Pictionary.

Summaries were generated by Google AI. Generative AI is experimental.

Bullet points

  • Google I/O event showcased new AI products and updates.
  • Gemini 1.5 Pro offers improved long context window and document analysis.
  • Imagen 3 generates high-quality text-to-image, including decorative text.
  • Gemini overlay on Android provides context-aware suggestions for text and videos.
  • Project Astra combines voice and visual inputs for interactive experiences.

Shakespeare-ish

At Google I/O, new AI revealed,
Gemini's prowess, our minds are sealed.
Parsing leases, texts, with ease untold,
Knowledge distilled, in seconds, bold.

Imagen's artistry, a sight to behold,
Text to image, stories yet untold.
Letters adorned, with jam and balloons,
Menus crafted, with delicious tunes.

Astra's multimodal, a future so bright,
Speaks and sees, with conversational might.
Pictionary's challenge, Astra's triumph grand,
AI's potential, in our humble hand.

A man in a blue shirt poses for a selfie at Google I/O in front of a large, 3D I/O logo.

At I/O, we don’t just announce a bunch of news, like new Gemini models, AI agents and Android updates. We also let developers, reporters and partners experience some of that newness in action for the very first time through product demos.

This year, I was fortunate enough to spend the day at Shoreline Amphitheatre, where I/O happens, and dive into all of the demos. Here’s the inside scoop on a few of them.

For my first demo of the day, I watched Gemini Advanced parse through a 20-plus page property lease, full of complex legal phrasing and gotchas. I could then ask questions about the lease, like whether my landlord would allow me to have a pet dog, or whether I’d have to pay any extra fees. (I’m personally looking forward to being able to use the feature to decipher my next convoluted lease when my apartment is up for renewal.)

The next demo took things up a notch: Two Googlers fed Gemini the PDF of an entire economics textbook that was hundreds of pages long. It would have taken me hours to read the book, but Gemini was able to summarize and highlight important topics to study in seconds. It also made a multiple choice quiz to help me prepare for a theoretical upcoming test, creating not just the single right choice but also three incorrect answers to try and trip me up.

Two Googlers stand in front of laptops, with a large TV screen behind them showing a Gemini Advanced conversation. On the table between the laptops are a printed-out economics textbook and a huge stack of 1,500 pages of paper.

Googlers Sid Lall (left) and Adam Kurzrok (right) show how Gemini Advanced can now summarize a hefty economics textbook or thousands of pages of documents.

Both these demos took advantage of Gemini 1.5 Pro, which offered the longest context window of any large-scale foundation model when we debuted it earlier this year. We’re opening up early Gemini 1.5 Pro access to Gemini Advanced subscribers and giving them the ability to upload documents to the tool right from Drive, so they can use Gemini to summarize or analyze documents up to 1,500 pages long.

Gemini 1.5 Pro is also coming to the side panel of Workspace apps like Gmail, Docs, Sheets, Slides, and Drive. To see this in action, I used Gemini in Gmail to summarize a sample email of a weekly school report, and pull out specific details like which activities were for 7th-grade students, or what the packing list for an overnight trip was.

Gemini’s side panel can help you answer key questions about your content in Gmail, Drive and more.

The improved long context window can even pull information from multiple documents when responding to a single prompt. In the side panel in Docs, I asked for help writing a sample letter to a potential job candidate. In the prompt I linked to the job description document and the applicant’s PDF portfolio, both of which were in my Drive, and I instantly received an email draft that factored in relevant details from both documents.

Gemini 1.5 Pro isn’t our only shiny new model, though: I also got to try the freshly-announced Imagen 3, our highest-quality text-to-image model yet. One of the new abilities I was excited about was its ability to generate decorative text and letters, so I put it through its paces. I started by asking for a stylized alphabet, like letters spelled out in jam on toast, or with metallic balloons floating in the sky. Imagen 3 generated a full alphabet of letters, which I could then use to type out my own (delicious) menus.

After my Imagen 3 interlude, I continued with more Gemini demos. In one of them, I could pull up Gemini’s overlay on an Android phone and ask questions about anything on the screen. This really showed how we’re not only expanding what you can ask Gemini, but we’re also making Gemini context aware, so it can anticipate your needs and provide helpful suggestions.

The use case here was a lengthy oven manual. Whether it's a demo or real life, that's not something I'd be excited about reading. Instead of skimming through the document, I pulled up Gemini and instantly got an "Ask this PDF" suggestion. I tested questions like "how do I update the clock" and quickly got accurate answers. It worked just as well with YouTube videos. Instead of watching a 20-minute workout video, I asked a quick question about how to modify planks, got an answer, and was on my way to the next demo, where I tested a new conversation mode called Gemini Live that lets you speak with Gemini in the app, no typing required.

Speaking with Gemini was a different experience than the traditional chatbot interface: Gemini’s answers are a lot more conversational than the paragraphs of text and bullet-pointed lists you might usually get. In my demo, I learned you could even cut off Gemini in the middle of an answer. After asking for a list of kids’ activities for a summer vacation, I was able to interrupt a list of suggestions to dive in deeper on what materials I’d need for tie-dyeing a shirt.

The Project Astra (or “advanced seeing and talking responsive agent”) demo took things a step further to show the cutting edge of where our conversational AI projects are heading.

Many people in front of the AI Sandbox building at Google I/O, with a colorful Gemini logo on the front of the facade.

Our AI Sandbox, where developers and attendees tried out demos like Project Astra and other creative AI experiments, like MusicFX’s DJ Mode.

Instead of only working with whatever is on your screen, or the information that you’ve typed into a chat box, Astra’s multimodal capabilities can understand conversational voice prompts and live video feeds at the same time to unlock new kinds of AI experiences.

Astra’s alliteration demo started out simply: I showed the camera an object, like a banana or a piece of bread, and Gemini riffed on it with an alliterative sentence. (This setup used an overhead camera, but Astra can also use a phone camera or a camera on a wearable device.) I added more objects, and Gemini kept the conversation going, from “Bright bananas bask beautifully on the board” with a single fruit to “Culinary creations can catch the eye” when presented with a full buffet board.

Two side by side images. The left shows the Project Astra demo, with Gemini having a conversation that comes up with alliterative phrases based on the banana, hot dog and baguette in front of it. On the right, a large shelf filled with toys and other objects sits for people to use in the demo.

Astra alliterates with bananas, baguettes… and anything else you can show it.

Another Astra demo let me play Pictionary with Gemini: a seemingly simple interaction, but one that required the agent to understand images, remember what had been drawn each round of the game and use general knowledge to actually guess what I was drawing. In one demo, Astra knew that a circle wasn’t enough to base a guess on, but as I added lines underneath it, it went quickly from ID-ing a stick figure to recognizing that a person holding up a skull emoji was Hamlet.

Two side by side images showing games of Pictionary with Gemini in the Astra demo. The left sees Gemini guess an image of a tree off a green squiggle, while the right shows a guess of Hamlet for a stick figure holding a skull emoji.

Astra is undefeated at Pictionary.

Moving through the AI Sandbox and other demo stations felt like a glimpse into tomorrow. It was also humbling: Astra beat me at Pictionary in multiple rounds!
