
Building an iOS App like Siri

Learn how to build an iOS app like Siri

Siri is a voice-based personal assistant that takes voice commands and performs tasks like sending messages, making phone calls, setting alarms, searching on the web, finding directions and more.

Unfortunately, at the time of writing this tutorial there isn’t an official API from Apple, but there are a few third-party frameworks that allow developers to include functionality similar to Siri.

In this tutorial, you’ll learn how to use the Nuance Dragon Mobile SDK (SpeechKit), which is one of the leading frameworks in this arena. Specifically, you’ll learn:

  • The key concepts of the SpeechKit framework
  • The advantages of SpeechKit when compared to other frameworks
  • How to perform speech recognition and text-to-speech synthesis using the SpeechKit APIs

You’ll use this knowledge to build a Siri-like application to help your users find nearby restaurants, narrowed down by cuisine or category. Here’s a demo video of the app you will be building:

What is SpeechKit?

The SpeechKit framework is a high-level framework with two major components for developers: the speech recognizer and the text-to-speech synthesizer.

The framework carries out the following processes:

  • The audio component manages the audio system for recording and playback to give user feedback.
  • The networking component manages the connection to the server and automatically re-establishes timed-out connections.
  • The end-of-speech detector determines when the user stops speaking and automatically stops recording.
  • The encoding component manages the streaming audio’s compression to reduce bandwidth requirements and decrease latency.

SpeechKit follows a server-based architecture and relies on the Nuance speech server for voice recognition and text-to-speech synthesis.

  • For voice recognition, the SKRecognizer class sends audio streams to the server which then returns a list of text results.
  • To convert text to speech, the SKVocalizer class sends the text to the server and receives audio to play back.

SpeechKit supports 38 languages for speech recognition and 40 languages for the text-to-speech synthesis. Both male and female voices are available for many languages.

SpeechKit Framework Quick Reference

The SpeechKit iOS framework has the following four classes and three protocols that it uses for speech recognition and text-to-speech synthesis.

Class References

  • SKRecognizer: This is the primary class for voice recognition. It records the user’s voice, transmits it to the Nuance speech server and obtains a list of matching texts. Only one recognition process can happen at a time; create a new instance of this class for each subsequent recognition.
  • SKRecognition: This object contains the voice recognition results, a score for each result and a suggestion for the user. The suggestion might, for example, ask the user to speak more slowly or loudly, depending on the environment they’re speaking in.
  • SKVocalizer: This class handles text-to-speech synthesis and can be initialized to speak in a particular language or voice. It supports 40 different languages and speaks in both male and female voices.
  • SpeechKit: This class configures the SpeechKit subsystems, maintains the connection with the speech server, and initializes the audio components required to record the user’s voice or play back text-to-speech. It exposes only class methods, so you never create an instance of it.

Protocol References

  • SKRecognizerDelegate: These delegate methods maintain the flow of the recognition process. They detect when the recognition process begins and finishes, as well as when the app receives results or an error from the server.
  • SKVocalizerDelegate: These methods provide information about the speech process. They are essentially optional, as their primary use is to detect errors. The vocalizer queues sequential speech requests, so it isn’t necessary to know when a speech request has finished.
  • SpeechKitDelegate: This protocol allows the delegate object to observe the state of the SpeechKit process. Primarily it’s used to monitor if the process is being destroyed.

Getting Started

Start by downloading the starter project for this tutorial.

Open Main.storyboard and you will see that I have pre-created the user interface for you so you can stay focused on SpeechKit:

PrecreatedUI

The view of ViewController has a table view as a subview where you’ll display search results. Each TableView cell has two imageViews for thumbnail and rating images, and two labels where the app will display the name and address of a restaurant.

Build and run the application, and you should see a view with a textField and a mic button. The tableView underneath will not yet have any contents. That’s your job!

Starter App Screenshot

The app you’re building will help make your users’ lives easier by making it simple for them to find somewhere to eat and drink. No longer will they have to rely on old-fashioned typed out searches or local knowledge, now they’ll just be able to speak what they want into their iPhone and your app will save the day.

Your app will suggest nearby restaurants based on the type of cuisine the user craves, or by a category the user defines. Sounds simple enough, but your app will need to perform five primary tasks:

  • Get the user’s current location
  • Take a user’s voice input to search restaurants by cuisine or category
  • Use the Yelp API to look for restaurants that match
  • Display results
  • Provide voice feedback to the user

Aside from the user interface, the starter project also includes the code to obtain the user’s current location and connect to Yelp’s API. Cool!

Now you need to get your own Yelp API authentication token and secret key, as well as a Nuance application key.

Getting a Yelp API Token

Go to the Yelp Developer portal and click on Manage API Access.

Yelp API Token

Log In using your Yelp developer account and password. If you haven’t registered already, go ahead and Sign Up for a new account.

Yelp API Token

Fill in the details and continue.

Yelp API Token

Check your inbox for an email from Yelp. Click on the verification link to verify your email.

Yelp API Token

Once you’ve signed up, Log In to your Yelp developer account and click the Manage API Access link again.

Fill in the details on the API Access screen and submit to request an API token.
Yelp API Token

By default, you’ll have an API v1.0 token, but you’ll need an API v2.0 token for this project. Click on Request API v2.0 Key to generate one.
Yelp API Token

Take note of the Consumer Key, Consumer Secret, Token and Token Secret on this screen.

Yelp API Token Secret

If you already have a Yelp developer account, then log in and just click on Manage API Access Keys to find your API keys.

Yelp API Token Secret

Back in Xcode, open OAuthAPIConstants.h, which is under the Yelp_API_Request group in the sample project, and update the API consumer key, consumer secret, token and token secret.

Getting a Nuance Development Key

Next, go to the Nuance Mobile Developer portal, and click on Register to create a new account.

Nuance Token

Fill out the details and continue.

Nuance Token

Provide a brief summary of the application you’re going to build, and remember to select iOS as the platform. You can choose multiple platforms if you have big plans for cross-platform development.

Nuance Token

Feel free to leave the bottom section on this screen empty.

Nuance Token

Click on Get Started to create your account. You’ll receive an email at this point. Follow the steps to activate your account and log in.

Nuance Token

Once you’ve successfully logged in, you can download the iOS Mobile SDK.

Nuance Token

You’ll receive another email with your SpeechKit Application key and setup details. Be sure to take a note of the Host, Port, AppID and App Key.

Nuance Token

There’s one last step before you can start coding – you need to add the SpeechKit framework to your project.

Adding the SpeechKit Framework

Start by extracting the DragonMobileSDK zip file that you downloaded when you registered with Nuance. The extracted directory contains the SpeechKit framework, named SpeechKit.framework, along with some sample projects.

Next, select the YelpNearby project file in the Project Navigator and then select Build Phases. Expand the Link Binary with Libraries section and click the add (+) button.

Adding_SpeechKit_Framework

In the pop-up window, click the Add Other… button, then locate SpeechKit.framework in the directory you extracted earlier. Click Open to add the framework to the sample project.

SpeechKit Framework

All right – you’re finally done with the initial setup, time to code! :]

Speech Recognition with SKRecognizer

Open ViewController.h in the Class group, then import the SpeechKit framework header:

#import <SpeechKit/SpeechKit.h>

Also declare that the ViewController class is going to use SpeechKitDelegate and SKRecognizerDelegate:

@interface ViewController : UIViewController <UITextFieldDelegate, UITableViewDelegate, UITableViewDataSource, SpeechKitDelegate, SKRecognizerDelegate>

Finally, declare an SKRecognizer property:

@property (strong, nonatomic) SKRecognizer* voiceSearch;

Next, switch to ViewController.m. Open the email from Nuance or refer to the Development Key that you got from the Nuance Developer Portal, and add your SpeechKitApplicationKey before the @implementation section:

const unsigned char SpeechKitApplicationKey[] = {INSERT_YOUR_APPLICATION_KEY_HERE};

Be sure to replace the INSERT_YOUR_APPLICATION_KEY_HERE with your unique application key that you got in the email earlier. It should be a long list of hex numbers.
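For illustration only, the filled-in declaration ends up looking something like the sketch below. These bytes are made up; paste the exact array from your Nuance email instead.

// Illustrative placeholder bytes only -- use the exact array from your Nuance email.
const unsigned char SpeechKitApplicationKey[] = {
    0x12, 0x34, 0x56, 0x78, 0x9a, 0xbc, 0xde, 0xf0,
    0x0f, 0xed, 0xcb, 0xa9, 0x87, 0x65, 0x43, 0x21
};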

Next open AppDelegate.h and import the SpeechKit.h framework header:

#import <SpeechKit/SpeechKit.h>

Also replace the setupSpeechKitConnection method in AppDelegate.m with the following:

- (void)setupSpeechKitConnection {
    [SpeechKit setupWithID:INSERT_YOUR_APPLICATION_ID_HERE
                      host:INSERT_YOUR_HOST_ADDRESS_HERE
                      port:INSERT_YOUR_HOST_PORT_HERE
                    useSSL:NO
                  delegate:nil];
 
    // Set earcons to play
    SKEarcon* earconStart  = [SKEarcon earconWithName:@"earcon_listening.wav"];
    SKEarcon* earconStop   = [SKEarcon earconWithName:@"earcon_done_listening.wav"];
    SKEarcon* earconCancel = [SKEarcon earconWithName:@"earcon_cancel.wav"];
 
    [SpeechKit setEarcon:earconStart forType:SKStartRecordingEarconType];
    [SpeechKit setEarcon:earconStop forType:SKStopRecordingEarconType];
    [SpeechKit setEarcon:earconCancel forType:SKCancelRecordingEarconType];
}

Input the actual APPLICATION_ID, HOST_ADDRESS and PORT you have received from Nuance while setting up SpeechKit (it’s easiest to just copy/paste the code from the email).

The code you added here is new, so let’s review it a bit.

  • setupWithID:host:port:useSSL:delegate: initiates the necessary underlying components of SpeechKit framework. Upon calling this method, the app connects with the speech server and receives authorization. This provides the basis to perform recognitions and vocalizations.
  • setEarcon:forType: configures an earcon, which is a distinctive sound that represents specific events. In this case, the earcons play the respective audio cue when SpeechKit starts, stops and cancels a recording.

Next, you’ll add the three earcons that will play during the recognition process.

To do this, select Supporting Files, right-click and select Add Files to “YelpNearby”…, then locate the DragonMobileRecognizer sample project that’s part of the DragonMobileSDK.

Earcons

Select the earcon files that end with .wav, make sure Copy items into destination group’s folder (if needed) is checked, and click Add to add the files to your project.

Add_Earcons_2

Back in ViewController.m, replace the implementation of viewDidLoad method with the following:

- (void)viewDidLoad
{
    [super viewDidLoad];
 
    self.messageLabel.text = @"Tap on the mic";
    self.activityIndicator.hidden = YES;
 
    if (!self.tableViewDisplayDataArray) {
        self.tableViewDisplayDataArray = [[NSMutableArray alloc] init];
    }
 
    self.appDelegate = (AppDelegate *)[UIApplication sharedApplication].delegate;
    [self.appDelegate updateCurrentLocation];
    [self.appDelegate setupSpeechKitConnection];
 
    self.searchTextField.returnKeyType = UIReturnKeySearch;
}

What does that do? updateCurrentLocation starts updating the user’s location as soon as the view loads, so the app always has a current location to search from. setupSpeechKitConnection configures SpeechKit, so you don’t have to set it up anywhere else.

Next, replace the recordButtonTapped: method with the following:

- (IBAction)recordButtonTapped:(id)sender {
    self.recordButton.selected = !self.recordButton.isSelected;
 
    // This will initialize a new speech recognizer instance
    if (self.recordButton.isSelected) {
        self.voiceSearch = [[SKRecognizer alloc] initWithType:SKSearchRecognizerType
                                                    detection:SKShortEndOfSpeechDetection
                                                     language:@"en_US"
                                                     delegate:self];
    }
 
    // This will stop existing speech recognizer processes
    else {
        if (self.voiceSearch) {
            [self.voiceSearch stopRecording];
            [self.voiceSearch cancel];
        }
    }
}

recordButtonTapped: is called when the user taps the Mic button.

Let’s review the initializer that creates a new SKRecognizer instance:

- (id)initWithType:(NSString *)type detection:(SKEndOfSpeechDetection)detection language:(NSString *)language delegate:(id <SKRecognizerDelegate>)delegate;

This method takes 4 parameters:

  • type: This allows the server to anticipate the type of phrases the user is likely to say and select an appropriate vocabulary of words. When a user pulls up speech recognition, he or she usually intends either to search for something specific or to dictate something longer, such as a note or a social media post. These two intents map to SKSearchRecognizerType and SKDictationRecognizerType, respectively.
  • detection: This controls how the end of speech is detected automatically. Possible values are SKNoEndOfSpeechDetection, SKShortEndOfSpeechDetection and SKLongEndOfSpeechDetection, which mean ‘do not detect the end of speech’, ‘detect the end of a short phrase’ and ‘detect the end of a longer phrase’, respectively.
  • language: This defines the language spoken by the user, expressed as an ISO 639 language code followed by an ISO 3166-1 country code, for example en_US.
  • delegate: The delegate, which is the receiver for recognition responses. (A sketch of an alternative configuration follows this list.)
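For example, if you were building a note-taking feature instead of search, a dictation-style recognizer might be configured like this sketch. It reuses the same initializer shown above; the en_GB code is just an example of the ISO 639 plus ISO 3166-1 format, so check Nuance’s documentation for the languages your account supports:

// Sketch: a dictation-oriented recognizer for longer phrases in British English.
self.voiceSearch = [[SKRecognizer alloc] initWithType:SKDictationRecognizerType
                                            detection:SKLongEndOfSpeechDetection
                                             language:@"en_GB"
                                             delegate:self];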

stopRecording stops recording and stops streaming audio to the speech server. cancel stops all speech requests, even pending ones.

Next, add the following SKRecognizerDelegate methods in ViewController.m:

# pragma mark - SKRecognizer Delegate Methods
 
- (void)recognizerDidBeginRecording:(SKRecognizer *)recognizer {
        self.messageLabel.text = @"Listening..";
}
 
- (void)recognizerDidFinishRecording:(SKRecognizer *)recognizer {
        self.messageLabel.text = @"Done Listening..";
}

The recognizerDidBeginRecording: and recognizerDidFinishRecording: methods update the status message to let the user know whether the application is listening or has finished listening.

The most important delegate method in this whole implementation is recognizer:didFinishWithResults:, which is called when the recognition process completes successfully. Add the following code to implement it:

- (void)recognizer:(SKRecognizer *)recognizer didFinishWithResults:(SKRecognition *)results {
        long numOfResults = [results.results count];
 
        if (numOfResults > 0) {
           // update the text of text field with best result from SpeechKit
           self.searchTextField.text = [results firstResult];
        }
 
        self.recordButton.selected = !self.recordButton.isSelected;
 
        if (self.voiceSearch) {
            [self.voiceSearch cancel];
        }
}

In the above method, the results object contains an array of possible results, with the best result at index 0; the array is empty if no error occurred but no speech was detected. The method then updates the searchTextField text with the best result.

Once this process finishes, it has to cancel the recognition request in order to stop SKRecognizer from listening further.
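While debugging, it can be handy to see everything the server returned before picking the first result. Here’s a minimal sketch that assumes the SKRecognition properties described in the quick reference above (results, scores and a suggestion); double-check the exact property names against the SpeechKit headers in your SDK version:

// Debug only: log every alternative the server returned along with its score,
// plus any suggestion for the user (for example, to speak louder).
for (NSUInteger i = 0; i < [results.results count]; i++) {
    NSLog(@"Result %lu: %@ (score: %@)",
          (unsigned long)i, results.results[i], results.scores[i]);
}
if ([results.suggestion length] > 0) {
    NSLog(@"Suggestion: %@", results.suggestion);
}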

But what do you do if the user’s speech is unclear because they are on a busy street, mumbling indiscriminately because they are hungry beyond measure, or there is another error of some kind?

If the recognition process completes with an error, the app needs to handle it and let the user know something went wrong. Add the following snippet to alert the user when there’s an error:

- (void)recognizer:(SKRecognizer *)recognizer didFinishWithError:(NSError *)error suggestion:(NSString *)suggestion {
        self.recordButton.selected = NO;
        self.messageLabel.text = @"Connection error";
        self.activityIndicator.hidden = YES;
 
 
        UIAlertView *alert = [[UIAlertView alloc] initWithTitle:@"Error"
                                                        message:[error localizedDescription]
                                                       delegate:nil
                                              cancelButtonTitle:@"OK"
                                              otherButtonTitles:nil];
        [alert show];
}

Build and run the application now on a device (the SDK does not work properly on the simulator).

Once the application launches, tap on the Mic icon and say something clearly in English. SpeechKit will detect it and display the result. Since this is your first time, why not start with a polite “Hello?”

Starter App Screenshot

Note: If when you try to record it immediately says “Cancelled” and shows an error like “recorder is null” or “[NMSP_ERROR] check status Error: 696e6974 init -> line: 485″, this probably means either something is wrong with your SpeechKit keys, or the SpeechKit servers are down. Double check your keys, and/or try again later.

Using Yelp Search to Find Matching Restaurants

Once SpeechKit can recognize a voice, you need to identify the search keyword and find matching restaurants using Yelp’s API.

To do this, start by opening ViewController.h and importing YelpAPIService.h, the header that contains the definition of the YelpAPIService class:

#import "YelpAPIService.h"

Next, declare that the ViewController class conforms to YelpAPIServiceDelegate:

@interface ViewController : UIViewController <UITextFieldDelegate, UITableViewDelegate, UITableViewDataSource, SpeechKitDelegate, SKRecognizerDelegate, YelpAPIServiceDelegate>

YelpAPIServiceDelegate defines the loadResultWithDataArray: method, which is called when the Yelp search finishes, passing in an array of matching restaurants.
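For reference, the protocol is plausibly declared along these lines in YelpAPIService.h. This is a hypothetical sketch of the starter project’s header, so check the actual file:

// Hypothetical sketch of the starter project's delegate protocol.
@protocol YelpAPIServiceDelegate <NSObject>
// Called with an array of restaurant results once the Yelp search completes.
- (void)loadResultWithDataArray:(NSArray *)resultArray;
@end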

Next, declare a few properties and methods:

@property (strong, nonatomic) YelpAPIService *yelpService;
@property (strong, nonatomic) NSString* searchCriteria;
 
- (NSString *)getYelpCategoryFromSearchText;
- (void)findNearByRestaurantsFromYelpbyCategory:(NSString *)categoryFilter;

This declares a property for the YelpAPIService included in the starter project to interact with the Yelp API, a property for the search string, and two methods that will find Yelp search categories and matching restaurants.

Next, switch to ViewController.m and implement getYelpCategoryFromSearchText as follows:

- (NSString *)getYelpCategoryFromSearchText {
    NSString *categoryFilter;
 
    if ([[self.searchTextField.text componentsSeparatedByString:@" restaurant"] count] > 1) {
        NSCharacterSet *separator = [NSCharacterSet whitespaceAndNewlineCharacterSet];
        NSArray *trimmedWordArray = [[[self.searchTextField.text componentsSeparatedByString:@"restaurant"] firstObject] componentsSeparatedByCharactersInSet:separator];
 
        if ([trimmedWordArray count] > 2) {
            int objectIndex = (int)[trimmedWordArray count] - 2;
            categoryFilter = [trimmedWordArray objectAtIndex:objectIndex];
        }
 
        else {
            categoryFilter = [trimmedWordArray objectAtIndex:0];
        }
    }
 
    else if (([[self.searchTextField.text componentsSeparatedByString:@" restaurant"] count] <= 1)
             && self.searchTextField.text &&  self.searchTextField.text.length > 0){
        categoryFilter = self.searchTextField.text;
    }
 
    return categoryFilter;
}

getYelpCategoryFromSearchText extracts a category or keyword from search text, by looking for a particular pattern. For example, if the user says, “Japanese restaurants nearby” or “nearby Japanese restaurants” or “Japanese restaurant” or “Japanese restaurants” it’ll detect the keyword “Japanese” and pass that to the Yelp API.

The code in the above method splits the best search result on whitespace and takes the word that precedes ‘restaurant’. A more complex application might specify a complete grammar to fit its context or search categories; for many applications, though, the search text is simply whatever the user says.
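If you’d like to sanity-check the extraction, you could temporarily drop something like the following into viewDidLoad and watch the console. It’s a throwaway sketch, not part of the finished app:

// Temporary check: run a few sample phrases through the keyword extraction.
NSArray *samplePhrases = @[@"Japanese restaurants nearby",
                           @"nearby Japanese restaurants",
                           @"Japanese restaurant",
                           @"Japanese"];
for (NSString *phrase in samplePhrases) {
    self.searchTextField.text = phrase;
    NSLog(@"'%@' -> category '%@'", phrase, [self getYelpCategoryFromSearchText]);
}
self.searchTextField.text = @"";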

Next add this new method:

- (void)findNearByRestaurantsFromYelpbyCategory:(NSString *)categoryFilter {
    if (categoryFilter && categoryFilter.length > 0) {
        if (([CLLocationManager authorizationStatus] != kCLAuthorizationStatusDenied)
            && self.appDelegate.currentUserLocation &&
            self.appDelegate.currentUserLocation.coordinate.latitude) {
 
            [self.tableViewDisplayDataArray removeAllObjects];
            [self.resultTableView reloadData];
 
            self.messageLabel.text = @"Fetching results..";
            self.activityIndicator.hidden = NO;
 
            self.yelpService = [[YelpAPIService alloc] init];
            self.yelpService.delegate = self;
 
            self.searchCriteria = categoryFilter;
 
            [self.yelpService searchNearByRestaurantsByFilter:[categoryFilter lowercaseString] atLatitude:self.appDelegate.currentUserLocation.coordinate.latitude andLongitude:self.appDelegate.currentUserLocation.coordinate.longitude];
        }
 
        else {
            UIAlertView *alert = [[UIAlertView alloc] initWithTitle:@"Location is Disabled"
                                                            message:@"Enable it in settings and try again"
                                                           delegate:nil
                                                  cancelButtonTitle:@"OK"
                                                  otherButtonTitles:nil];
            [alert show];
        }
    }
}

This method accepts a search category, usually a cuisine or type of restaurant such as Chinese, Japanese, Barbecue, Sandwiches or Indian, and passes it to the Yelp API along with the user’s latitude and longitude to look for nearby restaurants that might quell the user’s hunger pangs.

Almost done. Add the following lines to the recognizer:didFinishWithResults: delegate method, below the line self.recordButton.selected = !self.recordButton.isSelected;:

// This will extract category filter from search text
NSString *yelpCategoryFilter = [self getYelpCategoryFromSearchText];
 
// This will find nearby restaurants by category
[self findNearByRestaurantsFromYelpbyCategory:yelpCategoryFilter];

The above code uses the two methods you just implemented to extract a search category and execute a Yelp search.

YelpAPIService calls loadResultWithDataArray: once it has a response, so as a final step, implement that method:

# pragma mark - Yelp API Delegate Method
 
-(void)loadResultWithDataArray:(NSArray *)resultArray {
    self.messageLabel.text = @"Tap on the mic";
    self.activityIndicator.hidden = YES;
 
    self.tableViewDisplayDataArray = [resultArray mutableCopy];
    [self.resultTableView reloadData];
}

Once the application has Yelp’s response, it reloads the tableView with the results. tableView:cellForRowAtIndexPath: is already implemented in the sample project; it displays the thumbnail, name, address and rating of each restaurant as received from Yelp.

Build and run. Once the application launches, tap on the Mic icon and speak sentences like “Japanese Restaurants” or “Chinese Restaurants nearby”, or whatever kind of restaurant you’d like the app to find for you.

Final App Screenshot

Note: If you don’t get any results, it could be that Yelp’s database has no matching restaurants near your location. Try a restaurant type you’re sure exists nearby.

Text-to-speech Synthesis Using SKVocalizer

Now you’re almost there! For this next exercise, you will learn how to use SKVocalizer and SKVocalizerDelegate for text-to-speech synthesis.

In ViewController.h, declare that ViewController is going to use SKVocalizerDelegate. The ViewController delegate declaration should look like this:

@interface ViewController : UIViewController <UITextFieldDelegate, UITableViewDelegate, UITableViewDataSource, SpeechKitDelegate, SKRecognizerDelegate, YelpAPIServiceDelegate, SKVocalizerDelegate>

Declare these two properties:

@property (strong, nonatomic) SKVocalizer* vocalizer;
@property BOOL isSpeaking;

This declares a property for the vocalizer, and a BOOL that keeps track of the status of the text-to-speech process.

Next, in ViewController.m add the following code in the else section of recordButtonTapped::

    if (self.isSpeaking) {
         [self.vocalizer cancel];
         self.isSpeaking = NO;
    }

When the user taps the record button, the above code stops any speech currently in progress and cancels all pending speech requests.

Now, add the following code at the end of loadResultWithDataArray:resultArray::

if (self.isSpeaking) {
    [self.vocalizer cancel];
}
 
self.isSpeaking = YES;
// 1
self.vocalizer = [[SKVocalizer alloc] initWithLanguage:@"en_US" delegate:self];
 
if ([self.tableViewDisplayDataArray count] > 0) {
    // 2
    [self.vocalizer speakString:[NSString stringWithFormat:@"I found %lu %@ restaurants",
                                 (unsigned long)[self.tableViewDisplayDataArray count],
                                 self.searchCriteria]];
}
 
else {
    [self.vocalizer speakString:[NSString stringWithFormat:@"I could not find any %@ restaurants",
                                 self.searchCriteria]];
}

These lines configure a new SKVocalizer, the class that text-to-speech synthesis uses to make your application speak text; in this app, that text is the number of restaurants it found.

This happens in 2 steps:

  1. First, you initialize the vocalizer object using initWithLanguage:delegate:.
  2. Second, you make the vocalizer speak something using the speakString: method. You’ve already added the code for both steps above; a sketch of an alternative initializer follows this list.
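If you’d rather pick a specific voice than rely on the language default, the SDK also exposes a voice-based initializer. This is a hedged sketch; the voice name “Samantha” is only an example, so consult the voice list in the Nuance documentation for what’s available to your account:

// Sketch: initialize the vocalizer with a named voice instead of a language.
// "Samantha" is an example voice name -- check Nuance's voice list.
self.vocalizer = [[SKVocalizer alloc] initWithVoice:@"Samantha" delegate:self];
[self.vocalizer speakString:@"Here are your results."];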

Next, add the SKVocalizerDelegate methods to cancel the vocalizer if there’s an error:

- (void)vocalizer:(SKVocalizer *)vocalizer willBeginSpeakingString:(NSString *)text {
    self.isSpeaking = YES;
}
 
- (void)vocalizer:(SKVocalizer *)vocalizer didFinishSpeakingString:(NSString *)text withError:(NSError *)error {
    if (error != nil) {
        UIAlertView *alert = [[UIAlertView alloc] initWithTitle:@"Error"
                                                        message:[error localizedDescription]
                                                       delegate:nil
                                              cancelButtonTitle:@"OK"
                                              otherButtonTitles:nil];
        [alert show];
 
        if (self.isSpeaking) {
            [self.vocalizer cancel];
        }
    }
 
    self.isSpeaking = NO;
}

These methods are called when the vocalizer starts and stops speaking, respectively. Note you set the isSpeaking flag appropriately here.

Guess what – you’re finally done! Build and run, tap on the Mic icon and ask the app to find restaurants, for example, “Japanese Restaurants” or “Chinese Restaurants nearby.” The results should be similar to your previous results, but this time SKVocalizer will make your application say how many restaurants it found.

Final App Screenshot

Note: If you don’t hear SpeechKit saying anything, make sure your device isn’t in silent mode and that the volume is turned up. This is an easy mistake to make, because the earcons such as ‘listening’ and ‘done listening’ play even when your device is in silent mode, so you might not realize it’s silenced.

Watch out – your app now talks back!

Comparison with Other SDKs

Remember that SpeechKit is just one of many frameworks you can use for speech detection and vocalization. There are a number of other frameworks you may want to consider:

  • OpenEars: It’s FREE but only has support for English and Spanish.
  • CeedVocal: It’s available in 6 languages (English, French, Dutch, German, Spanish and Italian). It’s not free, but you can buy licences per application or pay a one-time fee of 6000 Euros for an unlimited number of applications. Each license is good for one language, but you can add additional languages as needed. Try working with the trial version first.
  • iSpeech: This SDK has support for 20 languages as of now, and is free for mobile apps.

Both OpenEars and CeedVocal struggle in noisy environments. Of all these frameworks, Nuance’s SpeechKit supports the most languages, understands different accents and remains useful when there is a lot of background noise.

However, there are two scenarios where you may need to consider other frameworks. First, if your application needs to perform speech recognition offline, you might prefer OpenEars; all the other frameworks require network connectivity.

Second, SpeechKit requires that port 51001 be left open. Some firewalls and antivirus software may block port 51001, which will cause a connection error while using SpeechKit. If that occurs, you need to configure the software to open the port.

Where To Go From Here?

Here is the finished sample project with all of the code from the above tutorial.

Congratulations, you have successfully integrated the Nuance Mobile SDK and built an iOS application like Siri! No doubt you’ve worked up an appetite, so why not test out your cool new app and see if it can help you find somewhere to grab a victory snack or frosty beverage?

You can dive deeper and find more about Nuance Mobile SDK under the resources tab in the Nuance Mobile Developer portal. Make sure you check out the iOS developer guide and iOS API reference to learn more about the speech recognition process.

If you have any questions, comments or find a unique requirement about your voice recognition application, feel free to leave a comment here and I will be happy to help.
