Some posts regarding Minerva got me thinking about artificial intelligence, and about how I need to make sure that I don’t get stuck trying to write something that I can’t actually accomplish. It’s unlikely that we’ll be writing the first fully functional AI for a My Dream App shareware application. :)
And it occurs to me that the lay-reader might not know what can and can’t be done in this area, and so might not know which suggestions are feasible and which aren’t. So let’s look at some specifics:
Some Specifics
Parsing an email word-by-word and looking for stuff like “meet <random words> at <numbers> <am or pm> on <day of week>” is very easy to do. It’ll find a suggested meeting time. But of course, if the email said “hook up” instead of “meet”, it wouldn’t find it (see “Expert Systems” below). This is just keyword searching, and I am, as I’ve said before, the self-proclaimed “regex king of Chicago”.
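To make the keyword-searching idea concrete, here is a minimal Python sketch of that kind of pattern. The regex, the group names and the find_meeting function are all made up for illustration; a real app would need many more phrasings, and it fails on “hook up” exactly as described.

```python
import re

# Hypothetical pattern: "meet <words> at <number> <am|pm> on <weekday>".
MEETING = re.compile(
    r"\bmeet\b.*?\bat\s+(?P<hour>\d{1,2})(?::(?P<minute>\d{2}))?\s*"
    r"(?P<ampm>am|pm)\s+on\s+"
    r"(?P<day>monday|tuesday|wednesday|thursday|friday|saturday|sunday)\b",
    re.IGNORECASE | re.DOTALL,
)

def find_meeting(text):
    """Return (hour, minute, am/pm, weekday) for the first match, or None."""
    match = MEETING.search(text)
    if match is None:
        return None
    return (
        int(match.group("hour")),
        int(match.group("minute") or 0),  # minutes are optional, default to 0
        match.group("ampm").lower(),
        match.group("day").lower(),
    )

print(find_meeting("Let's meet for coffee at 3 pm on Tuesday."))  # (3, 0, 'pm', 'tuesday')
print(find_meeting("Let's hook up at 3 pm on Tuesday."))          # None
```

The second call is the whole point: the pattern only knows the keyword it was given.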
Likewise, summarizing things can be done easily, using Apple’s built in summary feature (look in your Services menu or at Safari RSS’s length slider). I don’t know (or care) how this works, but I’d guess it assigns an importance to each word in the sentences and adds up the numbers. The point is that it works decently well.
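Just to illustrate the guess above (and emphatically not how Apple’s feature actually works, which I don’t know), here is a rough Python sketch that scores each word by how often it appears and ranks sentences by the sum of their word scores. The summarize function and its tokenizing rules are assumptions made up for this example.

```python
import re
from collections import Counter

def summarize(text, max_sentences=1):
    """Keep the sentences whose words are, in total, the most frequent."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # "Importance" of a word is simply its count across the whole text.
    importance = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        return sum(importance[w] for w in re.findall(r"[a-z']+", sentence.lower()))

    top = sorted(sentences, key=score, reverse=True)[:max_sentences]
    # Return the winners in their original order.
    return [s for s in sentences if s in top]

text = "The cat sat. The cat chased the dog across the yard. Birds sing."
print(summarize(text))  # ['The cat chased the dog across the yard.']
```

A scheme this naive favors long sentences and common words like “the”, so a real summarizer would at least ignore filler words and normalize for length.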
But abstracting a meaning from an email and then taking action based on that meaning, that requires real Artificial Intelligence. I can’t think of any good examples off the top of my head, but I’m sure some will come up. And I can say categorically that Minerva won’t be able to do them.
So to clarify my thinking on this, I’m going to talk in a generalized way about machine learning for a second. This is by no means exhaustive, or canonical, or hell, even necessarily correct — it’s all just off the top of my head. And it’s gonna be super-geeky, too, wheee!
Expert Systems
An expert system is a program that purports to be an “expert” on a given topic. It does this by doing a bunch of stuff like “is the word I’m currently looking at ‘golf’? Then say something in the category of <golf>”. If you’ve ever seen an instant messaging bot, or the program Eliza, or used one of those telephone support things that tell you to “state your problem” in a robotic voice, it was probably an expert system.
Likewise, text adventure games almost all use expert systems. The Douglas Adams game “Starship Titanic” had a huge one which managed to annoy most of its players. :)
Expert systems can be pretty convincing AIs, but at heart, they’re stupid - they know only what the programmer put into them. The smaller the problem domain, the more effective they can be. This is why they can be used well for stuff like “stating why your cell phone isn’t working” - there are only so many possible replies.
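Here is a tiny Python sketch of the “is the word ‘golf’? Then say something about golf” idea. The rule table, the canned replies and the reply function are invented for illustration; a real Eliza-style program has far more rules, but the shape is the same.

```python
import re

# Hypothetical rule table: keyword -> hand-written reply.
RULES = [
    ("golf", "Tell me more about your golf game."),
    ("mother", "How do you feel about your mother?"),
    ("battery", "Have you tried charging the phone overnight?"),
]
DEFAULT_REPLY = "I see. Please go on."

def reply(message):
    """Return the canned reply for the first keyword found in the message."""
    words = re.findall(r"[a-z']+", message.lower())
    for keyword, response in RULES:
        if keyword in words:
            return response
    # Nothing matched: the program knows only what was typed into RULES.
    return DEFAULT_REPLY

print(reply("I played GOLF yesterday"))  # Tell me more about your golf game.
print(reply("My dog ate my homework"))   # I see. Please go on.
```

Every bit of apparent smartness lives in that table, which is why making it smarter means typing in more rules by hand.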
Pros: Easy to program the basics. Cons: Stupid by default, smartness is directly proportional to programmer timeload and creativity. Programming the “smarts” is boring drudgery since it consists of a billion “if this, do that” lines.
(I’ve written an Eliza program. Easy to write, but all of the responses are essentially hand-coded. Mine was geared towards just insulting the user a lot since I wrote it as part of an English 101 project. And since the users always wind up cussing at them anyway.)
Using an expert system for My Dream App is feasible.
Artificial Neural Networks
Useful for classification and categorization. Basically, if you show it a bunch of things and tell it what “category” each thing belongs to, and then show it something new, it can give you a pretty accurate guess on what category the new thing will belong to.
Within certain constraints.
But those constraints tend to be pretty huge.
For example, let’s say you’re doing image recognition. If you show an ANN a billion pictures of houses (all located at the same spot in the picture, and all zoomed to the same level), and then you show it a new picture of a house, moved 10 pixels to the left, or zoomed 5%, it won’t recognize it at all. In other words, it depends on the “signal” always being in the same place with respect to the surrounding “noise”.
Even worse, the more variety you use in your training samples, the worse the overall classification becomes.
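The shifted-house problem can be shown with about the simplest neural network there is: a single perceptron looking at made-up 8-pixel “pictures”. Everything here (the pixel data, the function names, the integer learning rule) is an assumption for illustration, and it is far smaller than anything you would use for real image recognition.

```python
def predict(weights, bias, pixels):
    """Return 1 ("house") if the weighted sum is positive, else 0."""
    total = bias + sum(w * p for w, p in zip(weights, pixels))
    return 1 if total > 0 else 0

def train_perceptron(samples, epochs=10):
    """Classic perceptron rule: nudge the weights whenever a guess is wrong."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(epochs):
        for pixels, label in samples:
            error = label - predict(weights, bias, pixels)
            weights = [w + error * p for w, p in zip(weights, pixels)]
            bias += error
    return weights, bias

# Hypothetical training set: the "house" is always the pair of 1s at pixels 2-3.
samples = [
    ([0, 0, 1, 1, 0, 0, 0, 0], 1),
    ([0, 0, 0, 0, 0, 0, 0, 0], 0),
    ([1, 0, 1, 1, 0, 0, 0, 0], 1),  # house plus a speck of noise
    ([1, 0, 0, 0, 0, 0, 0, 0], 0),  # noise only
    ([0, 0, 1, 1, 0, 0, 0, 1], 1),
    ([0, 0, 0, 0, 0, 0, 0, 1], 0),
]
weights, bias = train_perceptron(samples)

print(predict(weights, bias, [0, 1, 1, 1, 0, 0, 0, 0]))  # 1: house in the usual spot
print(predict(weights, bias, [0, 0, 0, 0, 1, 1, 0, 0]))  # 0: same house, moved two pixels
```

The network only ever learned that pixels 2 and 3 matter, so the same shape two pixels over lands on weights it never trained.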
This leads to a problem known in the jargon as “feature extraction”. Feature extraction is the voodoo art of trying to abstract the “houseness” from a picture before sending it to the ANN. There are about a trillion schemes for feature extraction, which work better or worse depending on what it is that you’re trying to categorize. In fact, ANN schemes almost always boil down to this: the categorization part is easy, but extracting the features you actually want to categorize is hard.
If you’ve ever used a phone system that asks you to say your credit card number, it was likely using ANNs to map your vocalized numbers into “computer” numbers. And note that if you very slowly say “tttwwwwwwwwwwwwoooo”, it won’t recognize what you said - you’ve shifted the signal in time, so it can’t classify it.
Pros: Works very well for problems with a limited and well-defined number of choices, and with clean, non-shifted signals. Somewhat easy to implement. Cons: As the number of things to categorize goes up, resource usage (cpu and memory) balloons. Feature extraction can be difficult or impossible in the majority of problem domains.
(I’ve written several ANN programs. Moderately difficult to write, but they gave good results. One was used in this robot to classify user interaction as “curiosity” or “fear”.)
Using an ANN for My Dream App is unlikely to be feasible, unless it’s a very simple one. Apple has some frameworks that could possibly make it feasible, though, depending on use. I get the impression that ANNs are going out of fashion.
Stochastic Analysis
This one sounds intimidating, but is actually pretty simple. Basically, it boils down to this: “If the user did A, and then did B and then did C, well I’ll bet he’s going to do D next!” It’s the probability of a future action based on the set of past actions.
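The “A, then B, then C, so probably D” idea can be sketched in Python by counting what followed each run of recent actions. The function names and the order parameter are made up for this example; this is just one simple way to do it.

```python
from collections import Counter, defaultdict

def build_model(history, order=3):
    """Count which action followed each run of `order` consecutive actions."""
    model = defaultdict(Counter)
    for i in range(len(history) - order):
        context = tuple(history[i:i + order])
        model[context][history[i + order]] += 1
    return model

def predict_next(model, recent, order=3):
    """Return (most likely next action, estimated probability), or None."""
    counts = model.get(tuple(recent[-order:]))
    if not counts:
        return None  # never saw this run of actions before
    action, seen = counts.most_common(1)[0]
    return action, seen / sum(counts.values())

history = ["A", "B", "C", "D", "A", "B", "C", "D", "A", "B", "C", "E"]
model = build_model(history)
print(predict_next(model, ["A", "B", "C"]))  # D, with probability 2/3
print(predict_next(model, ["E", "E", "E"]))  # None
```

Note that the table needs one entry per distinct run of actions, so it grows quickly as the number of possible actions goes up.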
When you type text into a search box on Google and it suggests word-completions as you type, that’s stochastic analysis (of a very simple kind). When your white cat looks at your lap while you’re wearing all black and are about to go out on the town, you know from stochastic analysis that it’s about to jump into your lap. And from stochastic analysis, you know that your next move will be to get the lint roller.
When there’s a limited number of choices, stochastic analysis can be very powerful. But when it’s wrong, it’s flat-out wrong, and the effect can be jarring. Like when you predict, using stochastic analysis, that the other car coming up to the stop sign at the intersection that you’re currently approaching with the right of way is going to stop, but it doesn’t. As I said, jarring…
Bayesian analysis (seen in Mail.app’s spam filtering) can be considered a form of stochastic analysis.
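For the curious, here is a generic naive Bayes spam scorer in Python. It is a textbook-style sketch, not a description of what Mail.app does internally; the function names and the toy training messages are made up, and it assumes at least one training message of each kind.

```python
import math
from collections import Counter

def train_filter(messages):
    """messages: list of (text, label) pairs, label being 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, doc_counts

def spam_probability(model, text):
    """Estimate P(spam | text), treating words as independent of each other."""
    word_counts, doc_counts = model
    vocab_size = len(set(word_counts["spam"]) | set(word_counts["ham"]))
    log_score = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        # Start from how common this kind of message is overall.
        score = math.log(doc_counts[label] / sum(doc_counts.values()))
        for word in text.lower().split():
            # Add-one smoothing so an unseen word does not zero everything out.
            score += math.log((word_counts[label][word] + 1) / (total_words + vocab_size))
        log_score[label] = score
    # Turn the two log scores back into a probability.
    return 1 / (1 + math.exp(log_score["ham"] - log_score["spam"]))

model = train_filter([
    ("cheap pills buy now", "spam"),
    ("buy cheap watches", "spam"),
    ("lunch meeting tomorrow", "ham"),
    ("project meeting notes now", "ham"),
])
print(spam_probability(model, "cheap pills"))    # about 0.86
print(spam_probability(model, "meeting notes"))  # about 0.14
```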
Pros: Moderately easy to implement while the number of possible events remains small. Can be a very powerful prediction technique. Cons: As the number of possible choices goes up, training time and resource consumption go way up. There are no half-measures: it’s right or it’s totally inaccurate.
(I’ve not written a stochastic analysis program, but I’ve used an off-the-shelf library. It analyzed incoming sounds looking for human vocal patterns (called “phonemes”) and tried to predict the next one, based on what came before. This is a standard technique in speech recognition, used for feature extraction. As I said before, feature extraction tends to be “the hard part”.)
Using stochastic analysis for My Dream App might be feasible, depending on complexity and need.
Genetic Algorithms
Useful for iteratively figuring out how to accomplish some goal. Basically, you define the goal as a set of testable attributes. From a real-world example, let’s say that the goal is for an underwater creature to get from point A to point B quickly. The quicker it arrives, the more likely it is to breed, passing on its traits to the next generation. Breeding consists of mixing traits together, along with random mutations. (Google for “Karl Sims”, for an old but famous example.)
With subsequent generations, performance towards the goal tends to increase, until eventually, something is decreed “good enough” and a winner is announced with great fanfare and celebration among the aquatic beasties.
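The breed-mutate-repeat loop can be sketched in Python on a toy goal that is trivial to score: make a list of bits contain as many 1s as possible. The goal, the parameter values and the function names are all assumptions for illustration; scoring a swimming creature is where the real difficulty lives.

```python
import random

def fitness(genome):
    """Toy goal: the more 1 bits, the fitter the creature."""
    return sum(genome)

def evolve(genome_length=20, population_size=30, generations=200,
           mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        if history[-1] == genome_length:
            break  # "good enough": declare a winner
        parents = population[:population_size // 2]  # the fitter half breeds
        children = [population[0][:]]  # carry the current best over unchanged
        while len(children) < population_size:
            mom, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_length)
            child = mom[:cut] + dad[cut:]  # mix traits from both parents
            child = [1 - gene if rng.random() < mutation_rate else gene
                     for gene in child]  # occasional random mutation
            children.append(child)
        population = children
    return max(population, key=fitness), history

best, history = evolve()
print(fitness(best), len(history))
```

Because the best creature is always carried over unchanged, the best score can never go down from one generation to the next.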
This can be really, really hard to code. None of it is generalizable — it all has to be coded specifically for the situation you’re trying to solve. For example, how do you test how well something swims? By writing a physics engine that measures thrust against a viscous fluid. Very non-trivial. How do you convert genetic material into traits? Totally depends on the problem you’re trying to solve. (As an aside, if I’m not mistaken, the game “Creatures” used genetic algorithms.)
Also, stating the “goal” can be surprisingly difficult. For example, let’s say that the goal is to understand spoken words. How do you test that? How do you assign a score to that? It can be done, yes, but it takes some deep brainpower. (As another aside, an area where genetic algorithms have been used to good effect is in stock market trading. The goal is very easy to express: Make me money, and lots of it. I’m rich, bitch!)
Most importantly, genetic algorithms take time, and lots of it. You might need to go through millions of generations before you find a winner. That’s all happening in computer time, sure, but a million generations is going to take a while even on the speediest hardware.
Pros: Can solve tough problems in unconventional or surprising ways. Cons: Very hard to implement, very hard to define, takes massive computing resources, takes large amounts of cortex, all around difficult.
(I’ve written a genetic algorithm program. Three years of coding and I never got it right. True, the demands for the particular program were insanely difficult, but the point is, it’s tough. Very tough. I blame C++.)
Genetic algorithms are infeasible for My Dream App.
I said, “Wrap!” “It!” “Up!!!”
Okay, so I’m not trying to make any grand point with this post - basically, my thinking is that knowledge is power, and given some non-technical knowledge about what can be done easily and what takes years of coding, you’ve got more power to make good suggestions. And hopefully, understand why certain things are infeasible - it’s not just caprice.
And I expect a working genetic algorithm project from each of you when we meet next Wednesday. :)

Jason Harris
ShapeShifter/Chicken of the VNC
Jason Harris has been coding up spiffiness and silliness for about ten years, working on such diverse projects as a solid-state quantum computing simulator for electron waves in GaAs semiconductors and a Monte Carlo simulator for electron transport in nanostructure devices. He also wrote insane, down-to-the-metal microcontroller assembly language code for Octofungi, a robotic sculpture. In the Mac world, he's the primary author of ShapeShifter, Mighty Mouse, ThemePark, and heads the open-source Chicken of the VNC and Paranoid Android projects. He digs mountain biking, skateboarding, art, martinis, loud music, and creating oddly euphonious phrases. He never wears shoes if he can help it and can dance like a mofo!