GTT Notes for February, 2017: the Identifi App

Jason opened the meeting with a welcome, and gave a special welcome to those attending via teleconference. The goal is to have as many people in the room as possible, but for those who can’t attend in person, we need to figure out whether this option can be made available for each meeting. Next month, the plan is to have a representative from Uber in to talk to us; that’s still in negotiation. The month after next, the presentation will be on Siri. If anyone has ideas for topics they’d like to see covered, raise them. Get in touch by email at: gtt.toronto@gmail.com.

Tonight’s guest is the author of a free app called Identifi, a smartphone app that does recognition of various kinds. Anmol took the mic. Identifi uses artificial intelligence to recognize objects, text, facial expressions, etc. It’s available in 94 countries and in 26 languages. The main screen has four quadrants. There are instructions and settings. There are three object recognition modes. The first is basic, giving the least detail. The second gives more detail, and the third gives the highest level of detail. For example, the three modes might report: laptop; laptop turned on; laptop, plus a reading of the text showing on the screen. Responses take two to five seconds, depending on your internet speed and the level of detail you’ve requested. He demonstrated text recognition by reading a paragraph from The Great Gatsby. It works best with printed text, but also works with handwriting. He then demonstrated object recognition by photographing headphones, a banana, and a stapler. One of its unique features is that it recognizes not only objects, but also colour and background. It’s also notable for its speed. He did a few more demonstrations of object recognition, then demonstrated text recognition in French.

He explained that the app uses machine learning. The idea is that if you show a computer enough examples of similar objects, it learns to recognize that kind of object. It compares what it’s seeing with what it has in its database, which contains 600 million images. The database comes from Stanford University, and is constantly growing. The photo can be sideways or upside down and recognition will still work. It works basically the same way as the human eye. As long as the text is in the image, it will be read out. Artificial intelligence is still new, so the app is a long way from actually understanding what it’s seeing and being able to answer questions about it. There are other limitations: a picture of an orange on an orange background, for example, might give it trouble. A white piece of paper on a white tabletop will also not work. At the moment, you can’t pause the reading and resume it, but he’s working on that. The text doesn’t actually appear on your phone, so Braille output isn’t possible at the moment, but it’s in development.

There’s no limit on the number of pictures you can take. You can take a picture of Braille and it will tell you that it’s Braille, but it can’t interpret it. The app recognizes text in 96 languages, but text-to-speech only works in 26 of them, so that’s a limitation. The CloudSight API has 400 million images and is used for high detail mode. The app could be used in a mall to read signs. It recognizes brand logos as well as text, which would help you orient yourself. You could also use it to read the mall directory. OCR uses the Google Cloud Vision API. Anmol wrote the sentence construction for the recognition results himself. The initial output is in English, and is then translated into other languages as requested.
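For readers curious how the last two steps might fit together, here is a minimal sketch in Python. It is not Anmol's code: the function names, the wording of the sentences, and the short language list are all made up for illustration. It only shows the general idea of joining recognized labels into one English sentence, and of falling back to English when the preferred language isn't one that text-to-speech supports.

```python
# Illustrative sketch only; none of these names come from Identifi itself.

def build_sentence(labels):
    """Turn a list of recognized labels into one English sentence."""
    if not labels:
        # Hypothetical spoken fallback for an empty result.
        return "Nothing was recognized. Please try again."
    if len(labels) == 1:
        listed = labels[0]
    else:
        listed = ", ".join(labels[:-1]) + " and " + labels[-1]
    return "This looks like: " + listed + "."


def pick_speech_language(preferred, speech_languages, fallback="en"):
    """Use the preferred language if text-to-speech supports it, else fall back."""
    return preferred if preferred in speech_languages else fallback


if __name__ == "__main__":
    # Stand-in for the 26 languages text-to-speech supports.
    speech_languages = {"en", "fr", "es"}
    print(build_sentence(["stapler", "desk"]))
    print(pick_speech_language("hi", speech_languages))
```

In a real app, the English sentence would then be passed to a translation service and a text-to-speech engine; those steps are left out here because the notes don't say which ones are used.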

It’s being used in over 90 countries. He’s presented to institutions around the world, including RNIB and CNIB. The app has won awards from several universities, and has been recognized alongside Google and Facebook for artificial intelligence. Anmol explained that he’s from India, and lived near a school for the blind. He was working at a summer job in artificial intelligence. These two things combined in his mind, and the app resulted. This version was launched on the App Store last July. He’s in grade 12 of high school.

His plan is to work on version two. He wants to improve the user interface, add extra languages, and improve response times. Longer term, he wants to figure out a source of funding, and also to create an Android version. He hopes to work on those things this summer. He seems uninterested in charging for the app; he’d like to keep it free. One thing he might consider is an audio ad, which wouldn’t prevent anyone from using it. There are minor costs involved; he’s spent a couple of hundred dollars out of his own pocket. One member raised the problem that, when an object is not recognized, no voice message is given to tell the user what’s happening. Could there be an error message? Anmol replied that the phone might be on silent mode, and the app won’t work in silent mode. The app should always be giving you some response. Anmol said he would add an error message. Jason also pointed out that there are multiple steps: taking the photo, then tapping the “use photo” button.

When you open the app, the “take photo” button is in the bottom right. Take a picture exactly as you normally would. Tap on the “use photo” button, then two to ten seconds later you should get a response. You can also select a photo from your photo library. Under settings, you can configure how quickly the response is read out to you, what language you prefer, and the level of accuracy you want. The instructions button on the top right will offer help. Once you’ve taken a photo, you can’t save it. Anmol isn’t considering adding this, because it would add an unnecessary step and would raise privacy considerations. Jason added that, as well as the “take photo” button, you can use your “volume up” button to take a photo.

A member raised the point that saving a photo might not be worthwhile, but the recognized text might be something you want to keep. Someone else proposed including a “share” button, so that you could send the text somewhere else. Anmol replied that if there’s enough demand for it, he could include it. He recommended going to www.getidentifi.com and submitting requests. Anmol said he included the second step because in testing, people sometimes took pictures by accident, not meaning for them to be interpreted. The app can read credit cards, including numbers and logos. Information is never saved, so it’s secure. Neatly hand-printed text can be recognized, as can flowing computer fonts and some cursive writing. As long as most of the object is in the photo, you’re probably good. For a small object, about a foot away is optimal. The flash is always on auto, so ambient light level shouldn’t matter. The suggestion was made to add a “donate” button. Anmol said that would cause problems around being incorporated.

Send questions or feedback to www.identifi.com/support
