
Thread: Voice Activated Recording for the Nokia N900 / Linux Desktop.

  1. #1
    Join Date
    Feb 2006
    Location
    Amsterdam
    Posts
    13,691
    Mentioned
    146 Post(s)
    Quoted
    130 Post(s)

    Voice Activated Recording for the Nokia N900 / Linux Desktop.

    I had been looking for a long time for a program that records continuously, but only saves frames that contain sound "louder" than a specific threshold. I found none that were open source, and no closed-source ones that run on Linux or my N900 either... only a few heavy, closed-source programs for Windows.

    Why do I want this? Well... very often I have some great ideas, or just plain things I want to remember, when I'm walking in the forest, sitting in the bath, cycling in the city... (or just sitting in my room; music playing doesn't bother it). Then I can't just write it down. So this is kind of a memo application, but instead of typing, you just say whatever you want to say, and it will record that. It will obviously have to be on all the time, but it will only actually save the useful samples.

    So... I knew my phone could record because it had some "recorder" app. It worked quite well, but did not have this particular feature. I found the source, and it was written in C. It seemed to use GStreamer for recording, in combination with PulseAudio. (My phone uses PulseAudio)

    I was messing a bit with GStreamer, but it was a bit overkill and didn't send any WAVE headers either. So I fell back to a standard Linux program (it's probably on your desktop too, if you run Linux): arecord. This program simply dumps the recording as WAV data to standard output. So I started it as a subprocess from my Python script and piped all the WAVE output into my program. My program reads 8000 samples at a time (exactly 1 second) and then processes them. If that second contains at least one sample above the upper threshold and one below the lower threshold, it writes the entire second to the output file. Additionally, it always records one extra second after a second that contains a loud sound. If that extra second also contains loud sounds, it captures the next second as well, and so on.
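The gating rule described above can be sketched on its own, without any audio hardware. This is a minimal Python 3 sketch (the script in this post is Python 2); the function names `is_loud` and `gate` are made up for illustration, and the 120/135 defaults are simply the values used in the script below, assuming unsigned 8-bit samples where silence sits around 128.

```python
def is_loud(chunk, low=120, high=135):
    """Return True if an unsigned 8-bit chunk swings past both thresholds."""
    if not chunk:
        return False
    # Iterating over bytes in Python 3 yields integers 0..255.
    return max(chunk) > high and min(chunk) < low


def gate(chunks, low=120, high=135):
    """Yield the chunks worth keeping.

    A chunk is kept if it is loud, or if the chunk right before it was loud
    (the "one extra second" rule).
    """
    last_loud = False
    for chunk in chunks:
        loud = is_loud(chunk, low, high)
        if loud or last_loud:
            yield chunk
        last_loud = loud


if __name__ == "__main__":
    silence = bytes([128]) * 8
    speech = bytes([128, 150, 100, 128, 140, 110, 128, 128])
    kept = list(gate([silence, speech, silence, silence, speech]))
    print(len(kept), "of 5 chunks kept")
```

Each chunk here stands in for one second of audio; in the real script the chunks would come from `readframes()`.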

    It was quite a task to do this, as at first I didn't take a good look at my output data and started doing Fast Fourier Transforms, while I did not even need them.

    Simply run the program by either making it executable (chmod +x the_file) or call it like "python the_file".
    Obviously it will need a GUI and a system to manage all the output files, as it currently overwrites the last session. It should probably not write output to the file as it does now... More buffering would be more flash card friendly.
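One simple way to avoid overwriting the previous session would be a timestamped file name per run. This is only a sketch in Python 3; the helper name `session_filename` and the `memo-` prefix are hypothetical and not part of the script below.

```python
import os
import time


def session_filename(directory=".", prefix="memo", now=None):
    """Build a per-session output path such as ./memo-20100727-000500.wav.

    `now` is a Unix timestamp; None means "use the current time".
    """
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(now))
    return os.path.join(directory, "%s-%s.wav" % (prefix, stamp))


if __name__ == "__main__":
    print(session_filename())
```

The result could be passed to `wave.open(..., 'w')` instead of the fixed `'test.wav'`.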

    Some credits go to Benland100 for flaming/teasing me when I was using FFTs to do this. He put me on the right track.

    Edit: Obviously this will work on any system that has arecord and Python.

    python Code:
    #!/usr/bin/env python

    import subprocess
    import wave
    import time

    #p = subprocess.Popen(['gst-launch-0.10', 'pulsesrc ! wavenc ! fdsink fd=1'], \
    #                stdout=subprocess.PIPE)

    # arecord just dumps the raw wav to stdout. We will use this
    # to read from with our wave module.
    p = subprocess.Popen(['arecord'], stdout=subprocess.PIPE)

    # Open the pipe.
    f = wave.open(p.stdout)

    # Open file we will write to.
    o = wave.open('test.wav', 'w')
    o.setparams((1, 1, 8000, 0, 'NONE', 'not compressed'))
    o.setnchannels(1)

    high, lasthigh = False, False

    # Print audio set up
    print f.getparams()

    while True:
        # Read 1 second
        a = f.readframes(8000)
        b = [ord(x) for x in a]
        _min, _max = min(b), max(b)

        # Print bounds
        print 'min', _min
        print 'max', _max

        # TODO: The gate should obviously be configurable.
        if _max > 135 and _min < 120:
            high = True
        else:
            high = False

        # Write always if either is True.
        if lasthigh or high:
            o.writeframes(a)

        lasthigh = high

    f.close()
    o.close()
    Last edited by Wizzup?; 07-27-2010 at 12:05 AM.



    The best way to contact me is by email, which you can find on my website: http://wizzup.org
    I also get email notifications of private messages, though.

    Simba (on Twitter | Group on Villavu | Website | Stable/Unstable releases
    Documentation | Source | Simba Bug Tracker on Github and Villavu )


    My (Blog | Website)

  2. #2
    Join Date
    May 2006
    Location
    Amsterdam
    Posts
    3,620
    Mentioned
    5 Post(s)
    Quoted
    0 Post(s)

    Default

    Is this the answer of life?
    Verrekte Koekwous

  3. #3
    Join Date
    Jan 2010
    Posts
    5,227
    Mentioned
    6 Post(s)
    Quoted
    60 Post(s)

    Default

    My phone is a joke, but this really excited me. :3 I'd like to have a phone up to today's standards and be able to experiment with this.

    Great job! And it sounds really interesting. ^^

    Python Code:
    if max(b) > 135 and min(b) < 120:
      high = True
    else:
      high = False

    #Could simply be

    high = max(b) > 135 and min(b) < 120

    #no?
    Last edited by i luffs yeww; 07-26-2010 at 10:12 PM.

  4. #4
    Join Date
    May 2007
    Location
    knoxville
    Posts
    2,873
    Mentioned
    7 Post(s)
    Quoted
    70 Post(s)

    Default

    Quote Originally Posted by i luffs yeww View Post
    My phone is a joke, but this really excited me. :3 I'd like to have a phone up to today's standards and be able to experiment with this.

    Great job! And it sounds really interesting. ^^
    i have the same phone, cept its black

    its legit, does every thing i need
    <TViYH> i had a dream about you again awkwardsaw
    Malachi 2:3

  5. #5
    Join Date
    Dec 2008
    Posts
    209
    Mentioned
    0 Post(s)
    Quoted
    0 Post(s)

    Default

    nice.

  6. #6
    Join Date
    Jan 2010
    Posts
    5,227
    Mentioned
    6 Post(s)
    Quoted
    60 Post(s)

    Default

    Awkward, mine's black, too!

    /spam

  7. #7
    Join Date
    Feb 2006
    Location
    Amsterdam
    Posts
    13,691
    Mentioned
    146 Post(s)
    Quoted
    130 Post(s)

    Default

    Quote Originally Posted by i luffs yeww View Post
    My phone is a joke, but this really excited me. :3 I'd like to have a phone up to today's standards and be able to experiment with this.

    Great job! And it sounds really interesting. ^^

    Python Code:
    if max(b) > 135 and min(b) < 120:
      high = True
    else:
      high = False

    #Could simply be

    high = max(b) > 135 and min(b) < 120

    #no?
    Yes. And the *.close() calls on f and o can be removed as well, as the only way to exit the program currently is to use Ctrl+C, which will stop the program immediately. (Python will call .close though.) The loop should catch the Ctrl+C exception and then break, though.
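Catching Ctrl+C could look roughly like this. It is a Python 3 sketch of the idea only; `run_until_interrupted` and its three callback parameters are hypothetical names, standing in for "read one second", "gate and write it", and "close f and o".

```python
def run_until_interrupted(read_chunk, handle_chunk, cleanup):
    """Process chunks until Ctrl+C or end of input, then always clean up."""
    try:
        while True:
            chunk = read_chunk()
            if not chunk:
                # The recorder stopped producing data.
                break
            handle_chunk(chunk)
    except KeyboardInterrupt:
        # Ctrl+C raises KeyboardInterrupt in the main thread; treat it as "stop".
        pass
    finally:
        # Runs on both exit paths, so the WAV header gets finalized.
        cleanup()
```

With the script from the first post, `cleanup` would call `f.close()` and `o.close()`, which makes those two calls reachable again.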




  8. #8
    Join Date
    Jan 2010
    Posts
    5,227
    Mentioned
    6 Post(s)
    Quoted
    60 Post(s)

    Default

    Thanks for the info.

  9. #9
    Join Date
    Feb 2006
    Location
    Amsterdam
    Posts
    13,691
    Mentioned
    146 Post(s)
    Quoted
    130 Post(s)

    Default

    Also, I'm not sure if arecord always records at 8000 samples per second. Probably not, so I'll have to change that as well. Will test tomorrow with my laptop.
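Rather than hard-coding 8000, the rate could be taken from the WAV header that arecord sends. A minimal Python 3 sketch, assuming the input is opened with the `wave` module as in the first post; `match_output_to_input` and `make_test_wav` are hypothetical helpers, and the in-memory file only stands in for the arecord pipe.

```python
import io
import wave


def match_output_to_input(reader, writer):
    """Copy channel count, sample width and rate from reader to writer.

    Returns the number of frames that make up one second of audio.
    """
    writer.setnchannels(reader.getnchannels())
    writer.setsampwidth(reader.getsampwidth())
    writer.setframerate(reader.getframerate())
    return reader.getframerate()


def make_test_wav(rate, seconds=1):
    """Build a silent unsigned 8-bit mono WAV in memory (stand-in for arecord)."""
    buf = io.BytesIO()
    w = wave.open(buf, "wb")
    w.setparams((1, 1, rate, 0, "NONE", "not compressed"))
    w.writeframes(bytes([128]) * rate * seconds)
    w.close()
    buf.seek(0)
    return buf


if __name__ == "__main__":
    reader = wave.open(make_test_wav(16000), "rb")
    writer = wave.open(io.BytesIO(), "wb")
    frames_per_second = match_output_to_input(reader, writer)
    print("read", frames_per_second, "frames per chunk")
```

The main loop would then call `readframes(frames_per_second)` instead of `readframes(8000)`.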




  10. #10
    Join Date
    Dec 2006
    Location
    Banville
    Posts
    3,914
    Mentioned
    12 Post(s)
    Quoted
    98 Post(s)

    Default

    python Code:
    import magic
    The jealous temper of mankind, ever more disposed to censure than
    to praise the work of others, has constantly made the pursuit of new
    methods and systems no less perilous than the search after unknown
    lands and seas.

  11. #11
    Join Date
    Jul 2010
    Posts
    5
    Mentioned
    0 Post(s)
    Quoted
    0 Post(s)

    Default

    Thanks for this. How do you change the recording parameters? Other than the "8000" in the line below which is obviously the sample rate, what do the other numbers control? I want to be able to change the bit depth to 16 per sample and the channels to 2. I'm no stranger to arecord, but I don't understand python.

    o.setparams((1, 1, 8000, 0, 'NONE', 'not compressed'))

  12. #12
    Join Date
    Jul 2010
    Posts
    5
    Mentioned
    0 Post(s)
    Quoted
    0 Post(s)

    Default

    I found this http://docs.python.org/library/wave.html.

    But when I change the line o.setparams((1, 1, 8000, 0, 'NONE', 'not compressed')) to o.setparams((1, 2, 8000, 0, 'NONE', 'not compressed')), the audio just becomes garbled.

    Also, changing the 8000 in both places in the script doesn't work, it just makes the audio sound like the chipmunks.

  13. #13
    Join Date
    Feb 2006
    Location
    Amsterdam
    Posts
    13,691
    Mentioned
    146 Post(s)
    Quoted
    130 Post(s)

    Default

    Quote Originally Posted by moser View Post
    I found this http://docs.python.org/library/wave.html.

    But when I change the line o.setparams((1, 1, 8000, 0, 'NONE', 'not compressed')) to o.setparams((1, 2, 8000, 0, 'NONE', 'not compressed')), the audio just becomes garbled.

    Also, changing the 8000 in both places in the script doesn't work, it just makes the audio sound like the chipmunks.
    Hmmm yeah. Keep in mind that if you have two channels you'll probably need to either select one channel or read from both, I suppose.

    If you can wait a bit (two days or so) I can make it more flexible.

    Also, the 1 you are changing to 2 is the sample width.
    Code:
    Wave_write.setparams(tuple)
        The tuple should be (nchannels, sampwidth, framerate, nframes, comptype, compname), with values valid for the set*() methods. Sets all parameters.
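A likely reason for both the garbled audio and the chipmunk effect is that `setparams` only changes what the output header claims, not what arecord actually delivers. The Python 3 sketch below writes the same 8000 bytes of 8-bit data under three different headers and reports how long a player would think the file is; `duration_as_declared` is a hypothetical helper, and the in-memory buffer stands in for `test.wav`.

```python
import io
import wave


def duration_as_declared(raw, sampwidth, framerate, nchannels=1):
    """Write raw bytes under the given header and return the apparent length in seconds."""
    buf = io.BytesIO()
    w = wave.open(buf, "wb")
    # Tuple order: (nchannels, sampwidth, framerate, nframes, comptype, compname)
    w.setparams((nchannels, sampwidth, framerate, 0, "NONE", "not compressed"))
    w.writeframes(raw)
    w.close()
    buf.seek(0)
    r = wave.open(buf, "rb")
    seconds = r.getnframes() / float(r.getframerate())
    r.close()
    return seconds


if __name__ == "__main__":
    one_second = bytes([128]) * 8000  # 1 s of 8-bit mono at 8000 Hz
    print(duration_as_declared(one_second, 1, 8000))   # header matches the data
    print(duration_as_declared(one_second, 2, 8000))   # bytes get paired up as 16-bit samples
    print(duration_as_declared(one_second, 1, 16000))  # same data played twice as fast
```

So the header and the arecord settings have to be changed together; changing only the header reinterprets the same bytes.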
    Last edited by Wizzup?; 07-28-2010 at 09:15 AM.




  14. #14
    Join Date
    Jul 2010
    Posts
    5
    Mentioned
    0 Post(s)
    Quoted
    0 Post(s)

    Default

    Thanks for responding. I understand what the 2 did. But I don't understand why changing to 16 bits/sample causes the recorded audio to go garbled. I also don't understand how to increase the sample rate. Frankly I also don't understand where arecord fits into this! Oh well, I will wait until you make more changes. Thanks very much!

  15. #15
    Join Date
    Feb 2006
    Location
    Amsterdam
    Posts
    13,691
    Mentioned
    146 Post(s)
    Quoted
    130 Post(s)

    Default

    Well if I decide to work on this any time soon, you can always find the latest version here: http://git.wizzup.org/?p=vacr.git;a=summary

    For now, it would probably be easier to specify a sample rate of 8000 in arecord. (Just pass "-r" and "8000" as separate arguments after the program name.)

    eg:
    python Code:
    p = subprocess.Popen(['arecord', '-r', '8000'], stdout=subprocess.PIPE)
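When the command is given as a list, each list element reaches arecord as exactly one argument, so an option and its value should be separate elements. Writing `'-f S16_LE'` as a single element hands arecord the value with a leading space, which is most likely where the `wrong extended format ' S16_LE'` error reported later in this thread comes from. A small Python 3 sketch; `arecord_command` is a hypothetical helper, and it only uses the `-f`, `-r` and `-c` options.

```python
def arecord_command(rate=8000, sample_format="U8", channels=1):
    """Build an argument list for arecord, one list element per argument."""
    return [
        "arecord",
        "-f", sample_format,  # sample format, e.g. U8 or S16_LE
        "-r", str(rate),      # sample rate in Hz
        "-c", str(channels),  # number of channels
    ]


if __name__ == "__main__":
    print(arecord_command(16000, "S16_LE", 2))
```

The list can be passed straight to `subprocess.Popen(arecord_command(...), stdout=subprocess.PIPE)`.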




  16. #16
    Join Date
    Jul 2010
    Posts
    5
    Mentioned
    0 Post(s)
    Quoted
    0 Post(s)

    Default

    Thanks very much. I will follow this at your repo.

    How does one change the number of bits per sample? Right now it records using unsigned 8 bit, what would I do to change this to signed 16 bit? Putting -f S16_LE in the arecord line just causes an error like this:

    Code:
    arecord: main:495: wrong extended format ' S16_LE'

  17. #17
    Join Date
    Feb 2006
    Location
    Amsterdam
    Posts
    13,691
    Mentioned
    146 Post(s)
    Quoted
    130 Post(s)

    Default

    Quote Originally Posted by moser View Post
    Thanks very much. I will follow this at your repo.

    How does one change the number of bits per sample? Right now it records using unsigned 8 bit, what would I do to change this to signed 16 bit? Putting -f S16_LE in the arecord line just causes an error like this:

    Code:
    arecord: main:495: wrong extended format ' S16_LE'
    I was thinking about -s 16000. (Try arecord -s 16000 > test.wav) Ctrl+C it after a few seconds, and then play it with something that shows the rate.
    Last edited by Wizzup?; 08-01-2010 at 07:19 PM.




  18. #18
    Join Date
    Jul 2010
    Posts
    5
    Mentioned
    0 Post(s)
    Quoted
    0 Post(s)

    Default

    Yes, changing the sampling rate works fine, I also changed the number in the 2 other places in the script so the processing would make sense. But what about the bits/sample issue? I can't seem to pass that info to arecord, nor change it on the python line.

  19. #19
    Join Date
    Feb 2006
    Location
    Amsterdam
    Posts
    13,691
    Mentioned
    146 Post(s)
    Quoted
    130 Post(s)

    Default

    I'll get to this in a bit...
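For 16-bit recording, the gate itself would also have to change, since the `ord(x)` trick and the 120/135 thresholds assume unsigned 8-bit samples centred around 128. A rough Python 3 sketch, assuming arecord is started with `-f S16_LE` and the output is written with a sample width of 2; the function names and the threshold of 2000 are made-up illustration values, not tested on an N900.

```python
import array
import sys


def peak_amplitude_s16le(chunk):
    """Return the largest absolute sample value in signed 16-bit little-endian data."""
    samples = array.array("h")  # "h" = signed short, 2 bytes on common platforms
    # Drop a trailing odd byte so the data is a whole number of samples.
    samples.frombytes(chunk[: len(chunk) - len(chunk) % 2])
    if sys.byteorder == "big":
        samples.byteswap()
    return max((abs(s) for s in samples), default=0)


def is_loud_s16le(chunk, threshold=2000):
    """Signed samples are centred on 0, so one amplitude threshold is enough."""
    return peak_amplitude_s16le(chunk) > threshold
```

With two channels the samples are interleaved, so this takes the peak over both channels together.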




  20. #20
    Join Date
    Feb 2006
    Location
    Amsterdam
    Posts
    13,691
    Mentioned
    146 Post(s)
    Quoted
    130 Post(s)

    Default

    I've recreated the repository at Github, since I'm not going to buy a new phone anytime soon and I want to resume work on this.

    Also, for people who were waiting on a phone that will be as tweaker friendly as the N900, I'd give up waiting and just buy an N900. It's receiving community updates and Nokia is soon going to remove it from its product line, I think. But there won't be another phone like the N900 anytime soon now that they've joined up with Microsoft, I suppose.
    Last edited by Wizzup?; 02-12-2011 at 04:11 PM.




  21. #21
    Join Date
    Jan 2010
    Posts
    5,227
    Mentioned
    6 Post(s)
    Quoted
    60 Post(s)

    Default

    Unless Microsoft starts doing more open stuff like they've been doing. That could turn out pretty sweet.
