1. Introducing Microsoft Cognitive Services
Without a doubt, artificial intelligence (AI) is an important part of information technology today. It will certainly become more important in the future, but it's already being used in many ways, so you, as a developer, should learn what tools and services are available to build next-generation applications.
Most of the world's software giants offer AI solutions, and Microsoft has an interesting range of services and tools that simplify the way you build and implement solutions based on artificial intelligence. This chapter provides a high-level overview of what Microsoft provides for AI, with a detailed description of the Cognitive Services APIs. This serves as the base for the next chapter, where you will walk through the Computer Vision API.
Introducing the Microsoft AI Platform
Microsoft provides the AI Platform ( www.microsoft.com/en-us/AI/ai-platform ), a set of services and tools that applications can consume across platforms. The AI Platform includes services for creating bots; services for machine learning and deep learning; and services for analyzing pictures, real-time videos, and speech.
More specifically, the Microsoft AI Platform includes the following:
The Bot Framework, which allows you to build and connect conversational bots and create natural interactions with users ( http://dev.botframework.com/ ).
Cognitive Services, a set of RESTful services capable of recognizing, understanding, and interpreting the content of pictures, speech, live videos, written text, and much more, and of describing it in natural language ( http://azure.microsoft.com/en-us/services/cognitive-services/ ).
Azure Machine Learning, a robust cloud platform that helps developers build their own custom AI solutions ( http://azure.microsoft.com/en-us/services/machine-learning-services/ ).
In the next section, you will get an overview of Cognitive Services; then, in the next chapter, you will start working with the Computer Vision API, which is the real focus of this book.
Introducing Microsoft Cognitive Services
Microsoft Cognitive Services are RESTful services that allow for natural user interaction on any platform on any device.
The Cognitive Services APIs embody the conversation-as-a-platform vision that Microsoft strongly believes in, by providing a rich set of APIs that allow for processing human language, sentiments, emotions, physical characteristics, audio, and much more. At a higher level, the Cognitive Services APIs are grouped into the categories in Table 1-1.
Table 1-1. Categories of Microsoft Cognitive Services
Service Category | Description |
---|---|
Vision | These APIs provide image-processing algorithms that help identify, caption, moderate, understand, and describe pictures and videos with a natural language description ( http://azure.microsoft.com/en-us/services/cognitive-services/directory/vision/ ). |
Knowledge | These APIs help you build on a customer's knowledge by finding events, locations, academic papers, and recommendations tailored to a customer's needs ( http://azure.microsoft.com/en-us/services/cognitive-services/directory/know/ ). |
Language | These APIs are capable of processing natural language, evaluating sentiments, and understanding a customer's needs ( http://azure.microsoft.com/en-us/services/cognitive-services/directory/lang/ ). |
Speech | These APIs enable audio processing with speaker recognition, voice verification, and audio conversion into text ( http://azure.microsoft.com/en-us/services/cognitive-services/directory/speech/ ). |
Search | Based on the Bing search engine services, these APIs allow you to implement image search, news search, video search, and autosuggestions ( http://azure.microsoft.com/en-us/services/cognitive-services/directory/search/ ). |
Each category contains a number of specialized sets of APIs. Describing all of them is beyond the scope of this book, but you can read more on the web page for each category. Because this book focuses on the Computer Vision API, which belongs to the Vision category, it is worth reviewing the APIs in that category so that you have an overview of what they can do. Table 1-2 summarizes the specialized APIs available in the Vision category.
Table 1-2. The Vision APIs
API | Description |
---|---|
Computer Vision API | Provides image-processing algorithms that help you understand, analyze, and describe images with a natural language response. It includes optical character recognition (OCR) and celebrity recognition ( http://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/ ). |
Content Moderator | Provides automated content moderation for images, videos, and text ( http://azure.microsoft.com/en-us/services/cognitive-services/content-moderator/ ). |
Video API | Provides powerful APIs that are capable of improving video quality as well as detecting and identifying faces and other elements within videos ( http://azure.microsoft.com/en-us/services/cognitive-services/video-api/ ). This is currently a preview service. |
Video Indexer | Allows you to maximize video interactions and insights, helping make video content more discoverable ( http://azure.microsoft.com/en-us/services/cognitive-services/video-indexer/ ). This is currently a preview service. |
Face API | Detects, identifies, analyzes, and organizes faces in an image ( http://azure.microsoft.com/en-us/services/cognitive-services/face/ ). |
Emotion API | Detects people's emotions in an image, based on face detection ( http://azure.microsoft.com/en-us/services/cognitive-services/emotion/ ). |
Custom Vision Service | Enables custom image processing based on tags and labels ( http://azure.microsoft.com/en-us/services/cognitive-services/custom-vision-service/ ). This service is currently in preview. |
The Cognitive Services APIs, including the Computer Vision API discussed in this book, are offered through the Microsoft Azure cloud platform. Consequently, you will need an active Azure subscription to work with these services. You can request a free Azure trial; I will explain how to configure your Azure subscription to get your personal access keys.
Introducing Development Tools and Platforms
Because they are based on the REST approach and on the standard JSON data exchange format, Cognitive Services can potentially be consumed by any application, on any device, on any operating system, and through any development platform and programming language that supports both REST and JSON.
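To make the REST-plus-JSON idea concrete, the following is a minimal sketch in Python, chosen only to illustrate that the approach is not tied to a single platform. It builds an HTTP POST request with a JSON body and decodes a JSON response, without contacting any real service. The endpoint URL, the access key, the request body shape, and the sample response are hypothetical placeholders, not values taken from this chapter; the `Ocp-Apim-Subscription-Key` header name is assumed here as the way an access key is passed.

```python
import json
import urllib.request

# Hypothetical placeholders: replace with the endpoint and key
# from your own Azure subscription.
ENDPOINT = "https://example.invalid/vision/analyze"
SUBSCRIPTION_KEY = "<your-access-key>"


def build_request(image_url: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a REST service."""
    # The body shape {"url": ...} is an assumption made for illustration.
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed header name for passing the access key.
            "Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
        },
    )


def parse_response(raw: bytes) -> dict:
    """Decode a JSON response body into a dictionary."""
    return json.loads(raw.decode("utf-8"))


request = build_request("https://example.com/picture.jpg")
print(request.get_method(), request.full_url)

# Invented sample payload, used only to show JSON decoding.
print(parse_response(b'{"caption": "a sample caption"}'))
```

Sending the request would only require passing it to `urllib.request.urlopen`; the same two steps (serialize a request to JSON, deserialize the JSON reply) apply in any language with an HTTP client and a JSON library.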
As a developer working with .NET technologies, you can consume these services in any kind of .NET application and with any of the .NET languages, such as C#, F#, and Visual Basic. Having said that, you have three major options.
On Windows, you can use Visual Studio 2017 as the development environment, with full support for all the .NET project types. If you do not have an MSDN subscription, you can download the Community edition for free ( www.visualstudio.com/downloads/ ).