AI agents, multimodal Phi-3 unveiled at Microsoft Build 2024

Satya Nadella used his keynote address on Day 1 of Microsoft’s Build Developer Conference to announce some exciting new AI developments that will soon be generally available. Microsoft Build is an annual conference where developers get to see the latest developments in Windows 11 and Microsoft 365. The first day saw the unveiling of some interesting generative AI tools. Team Copilot In 2023 Microsoft released its Copilot chatbot which provides real-time intelligent assistance while you work with Microsoft 365 tools like Word, Excel, PowerPoint, Outlook, or Teams. Nadella announced that it was getting a significant AI upgrade with Team Copilot. The post AI agents, multimodal Phi-3 unveiled at Microsoft Build 2024 appeared first on DailyAI.

May 22, 2024 - 16:00
 6
AI agents, multimodal Phi-3 unveiled at Microsoft Build 2024

Satya Nadella used his keynote address on Day 1 of Microsoft’s Build Developer Conference to announce some exciting new AI developments that will soon be generally available.

Microsoft Build is an annual conference where developers get to see the latest developments in Windows 11 and Microsoft 365. The first day saw the unveiling of some interesting generative AI tools.

Team Copilot

In 2023 Microsoft released its Copilot chatbot which provides real-time intelligent assistance while you work with Microsoft 365 tools like Word, Excel, PowerPoint, Outlook, or Teams.

Nadella announced that it was getting a significant AI upgrade with Team Copilot. Team Copilot expands Copilot from an individual personal assistant to become part of a team, improving collaboration and project management.

If you’re working as part of a team using Microsoft Teams, Microsoft Loop, or Microsoft Planner, Team Copilot can facilitate meetings by managing the agenda and taking notes. It can highlight important information, track action items, and address unresolved issues.

It can even act as a project manager assigning tasks, tracking deadlines, and notifying team members when their input is needed.

Custom copilot agents

Microsoft Copilot Studio will enable you to build custom copilots that act as agents that work independently after you give them instructions.

Using a natural language prompt you simply describe what you want the agent to do and then deploy it on multiple platforms.

Microsoft says these agents can:

  • Automate long-running business processes
  • Reason over actions and user inputs
  • Leverage memory to bring in context
  • Learn based on user feedback
  • Record exception requests and ask for help.

An example of the utility an agent like this could provide is an “order-taker” copilot that Microsoft says could “handle the end-to-end order fulfillment process—from taking the order to processing the order and making intelligent recommendations and substitutions for out-of-stock items to shipping it to the customer.”

This functionality allows you to create virtual employees to handle menial tasks like monitoring emails, data entry, or other repetitive tasks without adding to your staff headcount.

Phi-3 Vision

Microsoft has added a 4.2B parameter multimodal model to its Phi-3 family of small language models (SLMs). Phi-3 Vision is a low-cost and low-latency model that has audio and vision capabilities and a 128k context window.

These smaller models are aimed at on-device solutions where speed, cost, compute, and internet connectivity constraints make larger models impractical. The Phi-3 SLMs display superior reasoning abilities and outperform several larger models.

Enabling on-device multimodal reasoning opens up exciting applications in healthcare, education, and agriculture, especially for rural areas with no internet connectivity.

You can try out Phi-3 Vision here. It does a great job of analyzing images, extracting text, and even translation.

Phi-3 Vision benchmark results compared to other AI models. Source: Microsoft

Advanced Paste

Windows 11 now has a smarter way to copy and paste. The new Advanced Paste feature gives you more options for data that you copy to the clipboard. When you press Windows Key + Shift + V you are presented with options to paste as plain text, as markdown, or as JSON.

You can also type a description of how you want the copied text to be processed before pasting.

You’ll need an OpenAI API key and credits in your account to use this feature. It just saves you the trouble of pasting the text into ChatGPT and prompting it to format it there, before copying and pasting it back into your document.

The post AI agents, multimodal Phi-3 unveiled at Microsoft Build 2024 appeared first on DailyAI.