Amazon has launched three new cognitive services:
- Rekognition – Object and facial analysis
- Polly – Text to speech
- Amazon Lex – Chatbot for voice and text
Amazon Rekognition is a service that makes it easy to add image analysis to your applications.
Four functions are provided in this API:
- Object and Scene detection: Rekognition identifies various interesting objects such as vehicles, pets, or furniture, and provides a confidence score.
- Image Moderation: Detects adult content in an image and returns labels describing what was detected.
Cons: It does not classify images showing violence or bloodshed as adult content.
- Facial Analysis: You can locate faces within images and analyze face attributes, such as whether the face is smiling or the eyes are open, each with a confidence score.
- Face Comparison: Rekognition lets you measure the likelihood that faces in two images are of the same person. Cons: The similarity score for two faces of the same person depends on the person's age in each image. A localised increase in illumination on the face also alters the comparison result.
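The object detection and face comparison functions above can be sketched with the AWS SDK for Python (boto3). This is a minimal illustration, not a reference implementation: the helper names `detect_labels` and `best_face_match` and the default thresholds are assumptions made for this example, and running it against the real service requires AWS credentials and a configured region.

```python
def make_client():
    """Create a Rekognition client (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers below work without it
    return boto3.client("rekognition")


def detect_labels(client, image_bytes, max_labels=10, min_confidence=75.0):
    """Return (label name, confidence) pairs for objects and scenes in an image."""
    response = client.detect_labels(
        Image={"Bytes": image_bytes},
        MaxLabels=max_labels,
        MinConfidence=min_confidence,  # assumed cut-off, tune as needed
    )
    return [(label["Name"], label["Confidence"]) for label in response["Labels"]]


def best_face_match(client, source_bytes, target_bytes, threshold=80.0):
    """Return the highest similarity score between two images, or None."""
    response = client.compare_faces(
        SourceImage={"Bytes": source_bytes},
        TargetImage={"Bytes": target_bytes},
        SimilarityThreshold=threshold,  # assumed cut-off, tune as needed
    )
    scores = [match["Similarity"] for match in response["FaceMatches"]]
    return max(scores) if scores else None
```

A call would look like `detect_labels(make_client(), open("photo.jpg", "rb").read())`, where `photo.jpg` is a placeholder file name. Given the comparison caveats noted above, it may be worth testing `best_face_match` with photos taken at different ages and under different lighting before settling on a threshold.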
Amazon Polly is a service that turns text into lifelike speech. Polly lets you create applications that talk, enabling you to build entirely new categories of speech-enabled products.
- 47 voices across 24 languages are available, including an Indian English option.
- Tones such as whispering or anger can be added to a particular part of the speech using “amazon effects”.
- We can also instruct the system to pronounce a particular word or phrase differently, e.g. W3C pronounced as World Wide Web Consortium. The input text can also be given in SSML format.
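The pronunciation override and the whisper effect described above can be expressed in SSML and sent to Polly through boto3. The sketch below is illustrative only: `build_ssml` and `synthesize` are hypothetical helper names, and the choice of `Raveena` as the Indian English voice is an assumption to check against the voices available in your account.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(before, abbreviation, spoken_form, whispered):
    """Build SSML with a <sub> pronunciation alias and a whispered passage."""
    return (
        "<speak>"
        f"{escape(before)} "
        # <sub> makes Polly say the alias instead of the written text
        f"<sub alias={quoteattr(spoken_form)}>{escape(abbreviation)}</sub>. "
        # amazon:effect applies the whisper only to the enclosed text
        f'<amazon:effect name="whispered">{escape(whispered)}</amazon:effect>'
        "</speak>"
    )


def synthesize(client, ssml, voice_id="Raveena"):
    """Send SSML to Polly and return the MP3 audio as bytes."""
    response = client.synthesize_speech(
        Text=ssml,
        TextType="ssml",      # tells Polly the input is SSML, not plain text
        OutputFormat="mp3",
        VoiceId=voice_id,     # assumed Indian English voice
    )
    return response["AudioStream"].read()
```

With a real client created via `boto3.client("polly")`, the returned bytes can be written straight to an `.mp3` file. Escaping the text before embedding it keeps characters such as `&` or `<` from producing invalid SSML.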
Amazon Lex is a service for building conversational interfaces into any application using voice and text.
Cons: There is no synonym option, and entity extraction and intent classification are not very reliable.
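One way to judge intent classification and entity extraction for yourself is to send sample utterances to a bot and inspect what comes back. The sketch below assumes the original Lex runtime API in boto3 (`boto3.client("lex-runtime")` and its `post_text` call); `ask_bot` is a hypothetical helper, and the bot name, alias, and user ID are placeholders you would supply.

```python
def ask_bot(client, text, bot_name, bot_alias, user_id):
    """Send one text utterance to a Lex bot and summarize its interpretation."""
    response = client.post_text(
        botName=bot_name,    # placeholder: name of your bot
        botAlias=bot_alias,  # placeholder: published alias of the bot
        userId=user_id,      # any stable ID for the conversation
        inputText=text,
    )
    return {
        "intent": response.get("intentName"),  # None if nothing matched
        "slots": response.get("slots") or {},  # extracted entities
        "reply": response.get("message"),
    }
```

Running the same request phrased in several ways and comparing the returned `intent` and `slots` gives a rough picture of how the bot handles synonyms and varied wording.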
Note: Amazon has not launched a speech-to-text conversion API so far.