
I’ve seen Iron Man twice so far (it’s only been out for a week) and it’s a great movie. I keep telling my friends that it’s my biography from the future, but they changed my name to Tony Stark for my protection. One of the coolest things in the movie is Jarvis, the artificially intelligent “butler” that interacts with future me, Tony Stark. They have very natural conversations, and Jarvis can basically do anything and find out anything.

As soon as I got home from the movie (the second time) I started messing with the built-in voice recognition on my Mac. I now have a new side project once my thesis is done… Build Jarvis.

Granted… I won’t have anything remotely close to what is in the movie… but maybe I can figure out something that could have a reasonable conversation… I could add in “learning” later (if it’s even possible). I would need something that resembles ALICE or SmarterChild… I used to mess around with those a couple of years ago… maybe I can tie it together with the speech recognition.

41 Responses to “Jarvis from Iron Man”

  1. Dani Says:

    Build Jarvis. That was my first thought too.

  2. Wolf Says:

    Building Jarvis or something of that capacity would be a challenge, but possible. If you look at Jarvis as a model of AI you may not come close to building it, but if you look at Jarvis as a program that already has a call-and-response format of user interaction, you could build something like it. The speech recognition on your system records the wave patterns of your voice; when a pattern is recognised, it is matched against the system’s language database and the most suitable word is output. Simple internet access with a keyword search for a specific word would also be possible if the “find out anything” interface is desired. A dedicated online database that receives daily updates could keep your Jarvis up to date. So as you can see, with the help of several programmers and a bit of money, a programmable interface such as Jarvis could be achieved :). If you ever wanted to go ahead with such a project, I would be happy to assist.

  3. Martz Says:

    It would definitely be a challenge. If I were to attempt something it would be a dumb database version, not an artificially intelligent one. It may be possible to code a version that can add to its own database though. It’d probably be similar to a cache-type system for your “find out anything”. If you ask it something it could search Wikipedia/Google for the information and store it for later so it wouldn’t have to access the internet again. Or likewise you could just tell it about something… like teaching a child, and it could store that information for later. For instance: “Jarvis, I just bought the movie Iron Man”… later you could say “Jarvis, what movie did I buy last week?” and Jarvis would tell you that you bought Iron Man.

    It would be a cool open source project. Feel free to run with it if you want.

    Thanks for all of your comments
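
A rough Python sketch of the "tell it things, cache what it looks up" idea from comment 3. Everything named here (`Memory`, `tell`, `recall`, `find_out`, and the `lookup` function you pass in) is made up for illustration; nobody in the thread has built this, and a real version would need some way to turn a spoken sentence into a category and a value.

```python
from datetime import datetime, timedelta


class Memory:
    """Stores facts the user tells the assistant, plus cached lookups."""

    def __init__(self, lookup):
        self.facts = []       # list of (timestamp, category, value)
        self.cache = {}       # topic -> text fetched earlier
        self.lookup = lookup  # hypothetical function that queries the internet

    def tell(self, category, value, when=None):
        """Remember something the user said, e.g. a movie they bought."""
        self.facts.append((when or datetime.now(), category, value))

    def recall(self, category, since):
        """Return every remembered value in a category since a given time."""
        return [v for (t, c, v) in self.facts if c == category and t >= since]

    def find_out(self, topic):
        """Look a topic up once, then answer from the local cache."""
        if topic not in self.cache:
            self.cache[topic] = self.lookup(topic)
        return self.cache[topic]
```

So "Jarvis, I just bought the movie Iron Man" would become something like `memory.tell("movie_bought", "Iron Man")`, and "what movie did I buy last week?" would become `memory.recall("movie_bought", since=datetime.now() - timedelta(days=7))`.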

  4. Wolf Says:

    Yes, I agree with the concept of learning in terms of storing received data, but this could prove a problem depending on how much data you allow the system to take on. You have to think about how much storage space you could get hold of yourself. Another question is how open this system would be to the public. If it was just for yourself, then the methods of storage would be up to you. On the other hand, if the system was open to the public, any number of hosts could contribute the information their specific “Jarvis” has learned and place it in one database. This would allow every other system of its kind to access the same database and freely take and contribute more information. Think of it as a wiki if you like. As a user interface for the system we could create anything from an installable program to a graphical user interface. This would give more feel to the user but would indeed require more time. I suppose using C++ could simplify the process of creating a program of this type, but I believe it would have to be set up as a full project.

  5. Martz Says:

    Storage wouldn’t necessarily be an issue. I think the English Wikipedia (not including images, I think) is around a gigabyte. And whatever information you told the system would be converted to text (XML most likely), so that also wouldn’t take much space. The whole thing could probably fit on a large USB flash drive… BUT I do agree that it should be stored on the internet for multiple users to access. The trick would be to allow the program to have (probably) two or three data types: personal, networked (friends have access but not the public), and public. Social network stuff is all the rage now.
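
The three data types from comment 5 could look roughly like this in Python. The names (`Visibility`, `can_read`, the `friends` dictionary) are assumptions for the sake of the example, not part of any existing system.

```python
from enum import Enum


class Visibility(Enum):
    PERSONAL = 1   # only the owner can read it
    NETWORKED = 2  # the owner and the owner's friends can read it
    PUBLIC = 3     # anyone can read it


def can_read(owner, visibility, reader, friends):
    """Decide whether `reader` may see a fact stored by `owner`.

    `friends` maps each user name to the set of that user's friends.
    """
    if visibility is Visibility.PUBLIC or reader == owner:
        return True
    if visibility is Visibility.NETWORKED:
        return reader in friends.get(owner, set())
    return False
```

Each stored fact would carry one of these labels, so a shared database could hold everyone's data while only handing out what each user is allowed to see.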

  6. Martz Says:

    I think the first step for something like this would be to just get a text entry setup working. So you could type to talk to the computer (much like SmarterChild/whatever) and add voice recognition/response on top of that. It would be easier to debug and could be done in a more platform-independent way.

  7. Wolf Says:

    I agree. There are several open source speech recognition engines out there to use for a working model of something like this. The most common problem with speech recognition is the recognition of wave patterns. Most recognition software requires training the software to your voice. I think to make something like this unique, it could be built to recognise the structure of words in terms of, say, syllables, which could then be compared against a variety of words. This would be a great achievement.

  8. Wolf Says:

    I believe a measure of AI could actually be put into this. If, for instance, you built a program that accesses information it has stored itself, you could program it to create variables for each unit of information. This would allow the program to quickly access the information within itself. The AI would be getting a program to edit its own programming by adding these variables and access parameters for each one.

  9. Martz Says:

    I was thinking about this last night and I think that to improve the accuracy of recognition the software should establish confidence levels for words/phrases and then do “verification” against the context of what’s being said. So it’d attempt to increase the confidence of low-confidence words. Meaning if you say a sentence to the computer and one of the words has a low confidence level, it can go back and check that the word it thinks it heard fits within the context of the entire sentence. I don’t know if anything out there does stuff like that; I’d imagine something does, but it would be more computationally intensive.

    I’d look into this stuff some more but I defend my Masters Thesis on Tuesday so I’m still preparing for that.
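
Here is one way the confidence check from comment 9 might be sketched in Python. It assumes the recognizer hands back, for each word, a best guess, a confidence between 0 and 1, and a list of alternative guesses, and that you have some table of how often word pairs occur together (`pair_counts`). Both of those are assumptions; the function name `recheck` is invented for this example.

```python
def recheck(words, pair_counts, threshold=0.5):
    """Re-examine low-confidence words using their neighbours.

    words: list of (best_guess, confidence, alternatives) tuples.
    pair_counts: dict mapping (word, next_word) to how often that pair occurs.
    Returns the sentence as a list of words after re-checking.
    """
    result = [guess for (guess, _, _) in words]
    for i, (guess, confidence, alternatives) in enumerate(words):
        if confidence >= threshold:
            continue  # trust confident words as they are

        def fit(candidate):
            # How well does this candidate sit between its neighbours?
            left = pair_counts.get((result[i - 1], candidate), 0) if i > 0 else 0
            right = (pair_counts.get((candidate, result[i + 1]), 0)
                     if i + 1 < len(result) else 0)
            return left + right

        # The original guess comes first, so it wins any tie.
        result[i] = max([guess] + list(alternatives), key=fit)
    return result
```

Only the words below the threshold get the extra pass, which keeps the added computation small.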

  10. Martz Says:

    And yeah, I agree with comment 8, didn’t see it as I was writing comment 9

  11. Martz Says:

    Something else I’ve been thinking about… My two primary computers (both Macs) have mics integrated into them, but if I’m playing music over the speakers I can’t use the microphone because it doesn’t cancel out the sound. There should be a way to take the waveform of the sound card’s output and do a subtraction from the microphone input (probably with some kind of delay built in). I wouldn’t THINK that that would be computationally intensive but it may be.
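
The subtraction-with-delay idea from comment 11 can be sketched in a few lines of Python. This is a toy under strong assumptions: the playback reaches the mic as an exact copy, shifted by a whole number of samples and scaled by a known `gain`. Real rooms add echoes and distortion, so practical echo cancellers use adaptive filters instead. The function names are invented for this example.

```python
def estimate_delay(played, recorded, max_delay):
    """Find the shift (in samples) where playback best lines up with the mic."""
    best_delay, best_score = 0, float("-inf")
    for delay in range(max_delay + 1):
        # Cross-correlation at this shift: large when the signals line up.
        score = sum(p * r for p, r in zip(played, recorded[delay:]))
        if score > best_score:
            best_delay, best_score = delay, score
    return best_delay


def cancel_playback(played, recorded, delay, gain=1.0):
    """Subtract the delayed playback from the mic signal, sample by sample."""
    cleaned = list(recorded)
    for i, sample in enumerate(played):
        j = i + delay
        if j < len(cleaned):
            cleaned[j] -= gain * sample
    return cleaned
```

The subtraction itself is cheap (one multiply and one subtract per sample); the delay search is the more expensive part, and it only needs to be redone when the delay changes.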

  12. Aris Says:

    OK, you guys sound like smart-ass people. Do you guys think that all of you together can make this system? I will fund it, and whatever you guys need I’ll take care of. If this is possible, please email me back and I’ll make the necessary arrangements to come see you guys one by one, and then we can all meet and do this thing. Truth be told, I have no grasp of programs and shit like that; I just really want this in my house. So if you guys are serious, email me back, I’ll pay you, and we can have fun. Write me an email if possible: pure_grk@hotmail.com

  13. Wolf Says:

    As we said before, Aris, a computer program of this magnitude is very possible. We are currently only looking at the logic behind creating one. The speech recognition is one of the main issues, especially taking into consideration what Martz has most recently suggested. I would be more than happy to go ahead with creating a program like this, as it is much different to the computer games I normally assist in building. I don’t want to speak for him, but Martz seems enthusiastic about this as well.

  14. Aris Says:

    So it’s all good then. Find out how many of you genius guys can help and give me your emails, and once I leave Europe I will come see you guys one by one and we can speak about this in person, unless you have another alternative. If you guys can do this, we are gonna have a great time doing it, so tell me what I can do to help and I’ll do it.

  15. Wolf Says:

    Hehe, well, I will certainly email you when we decide on the players involved. And I am in Europe :)

  16. Aris Says:

    Really? Where in Europe? I’m in Greece right now, so I can come meet you in person if you like and we can find more people. If you guys can do it, I’ll take care of everything, whatever you guys need: money, places to stay, girls, cars, everything.

  17. Matt Says:

    I just saw the movie for the first time the other night and I thought the same thing. I am looking very seriously into doing this. So far, my Windows Vista has Speech Recognition. I don’t know how well it would hold up for an AI though (knowing M$). I looked into PsyClone. That seems to be a good option. All I know is that I need speech recognition and speech synthesis, and then to allow the AI to learn; some AIs are able to read Wikipedia to learn. Other AIs are playing Robogames to learn. It’s becoming reality. If you want to team up and try to create something like this, let me know. Email me.

  18. Aris Says:

    Yes, yes, yes, I want to team up to make this, but unfortunately I don’t know anything about it. I can just fund anything you guys need. So please, if you can get a kick-ass team together, email me and I’ll fly you guys down or I’ll come up, I don’t mind. I just got a monster house and I want this in my house, so email me back: pure_grk@hotmail.com

  19. Ben Reilly Says:

    The strangest thing about reading this is the feeling of deja-vu. I thought exactly the same thing as you… and I started on speech recognition on my Mac. It quickly became annoying as it was hard to calibrate it properly.

    What I did do though is use my Mac to read out the weather for today and the next two days (RSS from the BBC website) every morning as part of my alarm (which plays music from my iTunes). I guess I was going for something similar to the wake-up system Tony had with JARVIS.

    Dude, if you have any ideas or are actually working on something like this, I’d be happy to be in it with you as I’m quite determined to have a ‘Jarvis’ in my home in the future… my day job and busy life makes things a bit difficult!

  20. The Mask Says:

    I hope you succeed. Good Luck!

  21. BB22 Says:

    Hi guys, I’m another person who after seeing Ironman thought “OMG! I WANT THAT!” so I started looking into creating a Jarvis, which is when I came across your site. Been looking into it all a bit more each day. You guys should really check this out: http://www.automatedliving.com/default.htm
    It is a voice-controlled home automation system and so much more. (Just to say I don’t work for them, just very impressed by it.) It will control all your lighting and whatnot, plus read your emails, tell you what’s on TV, answer your phone… it will even run a bath for you if you have it set up to do so!
    Surely this is the closest thing to a “Jarvis” you can get.
    Reading through your comments, you guys seem to know what you’re talking about when it comes to programming. I’m sure this system could be “pimped up” in a way that makes it appear a bit more like Jarvis. The skin of the interface can be changed, I’m sure, and I think you can download more voices for it. There are samples on the website and they don’t sound too bad. Anyway, tell me what you think. Look forward to reading your comments.

  22. Kyle Says:

    I thought the same thing when I saw the movie, I mean besides thinking that to build that suit would be awesome! Building Jarvis would be an amazing breakthrough not only in home automation, but also in assisted living for the elderly and the disabled. I would love to be in on designing and building this UI. I’m not an expert programmer or anything, but I am learning and I think that I have some good ideas; I just don’t exactly know how to implement all of them.

    Something that you guys might want to look at along with the HAL system is an OS called Linux Media Center Edition, LinuxMCE. I’ve personally used it and it’s not that bad. It’s no Jarvis, but I believe that it would be a good stepping stone.

  23. khalid Says:

    Dear Dani and Wolf, I can assist financially. I would like to help in building it with you guys. There is no amount of money I can’t get. Let’s meet by the end of this year; I’ll buy you guys the tickets and we would meet in Europe.

  24. Colin Says:

    Hi,

    Guys, I would love to see what progress you have made or what third-party software you have come across. I too was very impressed with the technology in the movie. One such product I know of is OneVoice http://www.sayittoplayit.com/mcc.html I haven’t used it myself but it looks cool. It does require that you have Windows Media Center. Anyway, I would love to know if anyone has found a touch-button system like the one he used to gain access to rooms etc.

    regards

    Colin

  25. Paul Hunter Says:

    My Mac speech is named Jarvis and he wakes up every morning and tells me it’s time to get up and what day it is. He also announces the forecast/current temp.

  26. Justin Says:

    That was my first thought as well. I knew I wouldn’t be the only one looking into this. How far have you gotten? Are you posting your progress somewhere?

  27. Justin Says:

    I am hoping you are still around. I have made some progress. I currently have a version of ALICE called ProgramD running on my Mac. I am currently trying to get AppleScript to run the shell script that starts ALICE in Terminal. One should be able to place this AppleScript in the Library/Speech folder, and when it is activated, Speech Recognition should read out what the script outputs. In theory this should work, but I am not an AppleScripter so it’s a little tough for me to create. Also, how to interact with the script by talking to it is still unclear. I imagine though that one could alter the AppleScript to read out what is output while you type your questions to ALICE through Terminal. My hope is to integrate the OS functionality of Speech Recognition with the personality and conversation database built into ALICE. What do you think?

  28. Ben Reilly Says:

    Hey Justin, I am actually taking the apple script route to having my own Jarvis too. At the moment I have a few simple ‘hard coded’ phrases in there. But I’m going to put the speech recognition aside and focus on the different things I want ‘Jarvis’ to handle for me. Also, if I can get sentence recognition down, the speech-to-sentence part can be handled at some point (and there’s already several things that can do that already probably…)

  29. Justin Says:

    Ben, maybe this will help. I found this while doing some research.
    http://bbs.macscripter.net/viewtopic.php?id=24662

    It gives simple tricks to setup multiple responses and a way to make “Jarvis” wait and listen for a command before he continues. I have one script that has him ask which browser I want to open and waits for me to say Safari or Firefox. Then he completes the command based on my response. But I am confused by what you mean by “speech-to-sentence.”

    I have AIML running on my computer. I used a version called ProgramD and set up the feature that allows you to communicate with your bot through iChat. iChat has the option to read your IMs to you, so I can converse with my bot and it uses the same system voice to respond. I have speech recognition running computer commands and I can communicate with bixby through IM. In this way it feels like it is the same system, but it is still two separate programs working to accomplish the same goal. Essentially I would like to have it all centralized in one program. For instance, I would like the personality one can get from AIML running through speech recognition to give me back different responses instead of the preprogrammed ones found in the script. ProgramD runs in Terminal and AppleScript works very well with Terminal, so the ability is there. One roadblock will be that, since Speech Recognition was never set up to compose text based on what you say (except for the preprogrammed commands you set up), a one-to-one conversation with “Jarvis” may not be possible without a texting application. But this current setup is quite nice and I am growing attached to my “Jarvis”.

    I feel bad hijacking this guy’s blog. Feel free to contact me by following the link associated with my name in this comment.

  30. Martz Says:

    Hey guys, Sorry I haven’t really been paying attention to this. Been really busy with my new job. If you want you can feel free to use my wiki at http://wiki.iswhatithink.com/ I only have my grad research up there, so there’s plenty of room to post scripts or whatever. If you end up taking it somewhere else instead please post a link so I can check it out. I may try to contribute at some point when things slow down. Good luck,

    M

  31. Ben Reilly Says:

    Outstanding Justin. Your AIML sounds interesting. Am I right in thinking your programD in AIML is a purely conversational tool that can connect to your terminal and your iChat? Whilst bixby is what you use for speech recognition (and can also work through iChat?)

    I’m actually building mine using Java and applescript. So I can have a cool Jarvis-like front end. So instead of speaking you type text. What I mean by speech-to-sentence is basically what you were saying about having a ‘texting’ application. The way I’m designing this at the moment appears to be dependent on that.

    Having said that, I like the link you posted. Listening out is a good idea, though I’d need to send the responses to my java app so I handle logic there. Because it could get quite messy solely in applescript. Also, with the ‘listening’ idea, you have a finite set of responses…

    You’re right we are hijacking this space…

  32. TheflyingdutchMan Says:

    Hi,
    Great thought to build jarvis. if you need any help with programming jarvis, just ask :)

  33. Tendai Muswere Says:

    Iron Man is my favourite movie as well.
    I’m a bit of a computer nerd and it has come to my attention that building Jarvis would be nowhere near impossible.

    All you would need to do is configure IM bots like SmarterChild to learn how people talk, and teach them how to respond in a conclusive manner by getting them to recognise common phrases, then teach them to recognise commands. Then compile all the instant messages that IM bots receive (I’m sure there are millions), meaning they would be a lot more accurate than Jarvis. You would also need to build a speech engine for it.

    It will demand a lot of processing power and storage, but the PS3 has already got the power and capabilities, so in a few years there will definitely be something better than Jarvis.

  34. Damon Says:

    I have been thinking about this for months and have also been playing with my Vista speech recognition. This would be a great project, and I’m going to try to learn some C++ and start from scratch, or at least get a boot program running.

  35. Richard Varley Says:

    Hi guys/girls,

    Some good ideas here. I was looking over the internet for AI software and to see what direction most people are going: backwards, from what I can see so far. I have been working on a voice link system for a few years on the side, in the hope of being able to talk to a system even if you are playing music loud, etc.

    I have had a few problems here and there, but with each failure it’s one step closer to getting it right.

    The problem I get at the moment is not cancelling out sound from around me but getting most speech software to work with changes in a voice pattern. It would seem that having a cold could stop it working.

    The setup I’m using, without giving anything away right now, uses a Bluetooth link to a mic (like a hands-free kit for a mobile); as this is close to the body, the sound should be clearer to the software or AI.

    One problem I am getting is the server space. The amount of data needed right now for a voice pattern is really high. My database has already passed 2TB.

    Right now most responses I get from an AI system have to be in a script format with a voice pattern linked to it for the system (or any system) to respond. Much like finding an *.exe file to run.

    With regards to Jarvis and the wit he used in the film, from what I know it would have to be set as a standard response to a certain program or action you have asked the system to run or execute. You may get fed up hearing the same response over and over.

    Has anyone else got any ideas on how a voice pattern can be kept to a smaller size? As it stands, one command can be up to about 25MB. I know it’s not huge, but when you link multiple commands to a command file it can get a little big. A few commands on my server are about 800MB.

    Just looking for a way forward or any other ideas here if you have any.

    Thanks all

    richardvarley@hotmail.com

  36. duff Says:

    I’m trying to build a Jarvis for myself; any advice you guys could offer would help. I’m putting in a home automation system in a week and would like to upgrade it to be like Jarvis in the movie.

  37. physicistjedi Says:

    Check out:
    http://physicistjedi.deviantart.com/art/Jarvis-1-0-106307065

  38. This is what i think » Blog Archive » Jarvis Part 2 Says:

    [...] linking this back to a post I made a few months ago:  Jarvis from Iron Man It was a popular post apparently.  So far it has 38 comments from people wanting a similar [...]

  39. Martz Says:

    I just posted a new entry related to this overall topic:
    http://this.iswhatithink.com/2008/12/jarvis-part-2jarvis-part-2/

    Hope everyone is doing well. Happy Holidays

  40. Aaron Harris Says:

    Reading all these posts, it is very pleasing to see that there are many people like myself who want a computer that would seamlessly make aspects of our lives effortless. I am also working on such a program, using AIML. AIML is the best route to go right now: you can start with Dr. Wallace’s “ALICE” brain and tweak and train it to your exact liking. This makes it easy for basically anyone to create a chatbot and use Mac speech recognition to converse with the bot (I recommend MacSpeech Dictate). With this topic come many variables that show up in the later stages of making this “program.” How would you have this technology everywhere you went without running out of power? My first thought was a Bluetooth device, but if you think about it that idea runs into even more problems, such as loud environments where noise interferes. I am continuing to work with AIML as it is the cheapest and easiest for me at the present time, BUT I believe that to have a “Jarvis-like” system at your fingertips, it will take more than a program giving the illusion of intelligence. We have yet to uncover the true workings of AI. Just think of how the human brain works, and how we hear things and interpret them: we may not remember them exactly, but if they were to come up again we have a reaction in our brain saying “hey, this rings a bell.” This is what AI is, a “computer brain” that starts as a child and listens and learns human-like activity.

    Sorry for the blabbing, these are just a few of my ideas and beliefs on this subject. I hope this helps someone somewhere. Feel free to correct and comment.

  41. damian Says:

    I’ve already started building my own “Jarvis”. Like you all said, it is a challenge, but one person (or two people if you want someone to do the voice of the program) can do it with enough determination. I cannot make a learning program; however, I have given the program internet access to find out info on whatever I ask. It uses a voice recognition program, then searches both Yahoo and Wiki Answers, takes the answer that comes up most often, and outputs it. It is currently unfinished and has limited responses for the questions I could ask, but it does pick randomly from twenty different possible answers (this is for non-internet questions). I sometimes find myself asking its opinion of my clothes because it seems to be having a real conversation, at which point I have it say “sorry no response available”. It is still very far from being finished and only works on Vista. Oh, and every time I install a new program onto my laptop I have to write extra code (that’s a lie, I use a module library), but it’s shaping up to be pretty decent.
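
The "take the answer that comes up most often" step in comment 41 is a simple majority vote. Here is a minimal Python sketch, assuming the answers have already been fetched from the search sites as plain strings; how damian actually does it isn't stated, and `most_common_answer` is a name invented for this example.

```python
from collections import Counter


def most_common_answer(answers):
    """Return the answer that appears most often, or None if there are none.

    Answers are lower-cased and stripped so small differences in
    capitalization or spacing don't split the vote.
    """
    normalized = [a.strip().lower() for a in answers if a.strip()]
    if not normalized:
        return None
    return Counter(normalized).most_common(1)[0][0]
```

When it returns `None`, the program could fall back to the “sorry no response available” line mentioned above.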


Jarvis from Iron Man

May 8th, 2008 |