TagLib, symlinks, and an optimized upload queue

The biggest piece of news this time around is that I've managed to integrate TagLib, the super versatile audio file analysis and tagging library, into SoulseekQt. Finding TagLib was a pretty major happy accident. I was showing SoulseekQt to a co-worker, and his second question (after: can you search for FLAC files?) was, does it show FLAC file audio properties in search results? No, I said, we only really analyze MP3 files for audio properties. But that's a good idea. A-googling I went for a C/C++ library that can analyze FLAC files and TagLib showed up almost immediately. On top of indexing FLAC file audio properties, TagLib also does that for (begin near copypasta:) MP3, MPC, MP4, ASF, AIFF, WAV, TrueAudio, WavPack, Ogg FLAC, Ogg Vorbis, Speex and Opus files. Goodbye my own personal MP3 analysis function, hello TagLib. Though audio attributes for all these file types should already show up in your own private share (I've only really tested MP3s, MP4s, FLACs and Oggs), the only audio attributes I'm currently indexing are bitrate and play length, the latter of which has been lost to us since the dawn of SoulseekQt. Whether the file is VBR or not, interestingly, is not information that's provided by TagLib and so that no longer shows up, but I feel the benefits more than outweigh this loss. I'd be happy to add other audio attributes that you folks feel might be useful, so let me know.

Another bit of good news for those of you who prefer organizing their shared folders via symlinks: those are back in play. Scanning symlinked folders was disabled at some point because it would occasionally create infinite scan loops. I'm making a list of the real location of each scanned subfolder and checking it twice to prevent those kinds of loops, so hopefully this'll address that.
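To illustrate the "list of real locations" idea, here is a minimal sketch using only the C++17 standard library rather than Qt. It is not the actual SoulseekQt code; the function name `scanFolder` and its parameters are made up for the example. Each folder is resolved to its canonical (symlink-free) path, and a folder whose real location has already been seen is skipped, which is what breaks a loop.

```cpp
#include <filesystem>
#include <set>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: walks `dir` recursively, following symlinked folders,
// but scans each real (canonical) location at most once.
void scanFolder(const fs::path& dir,
                std::set<fs::path>& visitedRealPaths,
                std::vector<fs::path>& scanned)
{
    std::error_code ec;
    const fs::path real = fs::canonical(dir, ec);  // resolves all symlinks
    if (ec) return;                                 // broken link or unreadable folder
    if (!visitedRealPaths.insert(real).second) return;  // real location already scanned

    scanned.push_back(dir);
    for (fs::directory_iterator it(dir, ec), end; !ec && it != end; it.increment(ec)) {
        // is_directory() follows symlinks, so linked folders are picked up too.
        std::error_code statEc;
        if (fs::is_directory(it->path(), statEc))
            scanFolder(it->path(), visitedRealPaths, scanned);
    }
}
```

With a folder tree where `a/b/loop` is a symlink pointing back at the root, the walk visits the root, `a` and `a/b`, then stops at `loop` because its real location is already in the set.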

Finally, I've sort of reworked the way the transfers are processed on the upload queue. The recent "upload small files immediately" feature created a performance problem that led me to slow down the speed at which the client processes new upload requests. That was not ideal, so I moved things around and now hopefully the whole thing should be faster.

These are all very sensitive changes, so I'm expecting more problems than usual. I'll be fixing anything that comes up ASAP, so keep me posted.

Windows: https://www.dropbox.com/s/vymgj3fe9qww72g/SoulseekQt-2015-1-21.exe?dl=0
Mac: https://www.dropbox.com/s/1pvykqpgsxhlxn9/SoulseekQt-2015-1-21.dmg?dl=0

Linux builds are pending as I've yet to get TagLib built on Linux.

Thanks, Nir

1/21: Fixed a bug related to scanning filenames with international characters. Included latest language translations.

Comments

psynaturecybine's picture

beautiful, something unexpected! will run and note found bugs over the coming weeks. thanks!
(linux version is as always awaited)

Thanks psy!

psynaturecybine's picture

1. in search result mp3 attribute is kbps and it's showing well. time length is not shown for all mp3's though.
2. for flac attribute column is empty. why not filling it with "sample rate, length"?
3. .db is on my excluded list, but is still downloaded when a folder is selected from search results (from Browse it's blocked properly). Is this behavior correct? A .db file should at least be unchecked by default in the "download files dialog", or even not shown there if it figures as excluded from download.

Regarding 1 and 2, assuming that both relate to search results, bear in mind that this new functionality affects your own shares only right now. Once this new version is out we should hopefully see more and more extra attributes for shared files as users upgrade.

3 sounds like an unrelated, but genuine bug. I'll look into it over the weekend at the latest.

Thanks! Nir

This build crashed on me within a few minutes while scanning my flac files. Maybe it ran out of memory. I changed the save client data every x minutes to a higher number, so I will try re-scanning.

Update: it crashed a second time during the scanning process.

Hmm. Are you monitoring memory consumption? How much does it consume when it crashes?

Also, do you have the diagnostics tab turned on? That one takes *a lot* of memory.

Yes, I always have the diagnostics tab turned on. I will try scanning with it off.

Getting crash during scan. Diagnostics tab on is not affecting memory consumption - about 100,000 kb and up to 15% CPU, so not exactly stressing!

Is it anything to do with number of files? The scan log shows the crash happens at the same point each time.

Oh - and have finally got it to run to an application error window (silent failing first three times)!

---------------------------
SoulseekQt.exe - Application Error
---------------------------
The instruction at 0x76fd9d8c referenced memory at 0x0c81f000. The memory could not be written.

Click on OK to terminate the program
Click on CANCEL to debug the program
---------------------------
OK Cancel
---------------------------

Disabling Diagnostics did not fix my crashing problem with the last build, (I posted a new crash log to my forum post), but I guess you can ignore that for now, so far so good on this build. Thanks for your amazing work!

MELERIX's picture

I noticed some small increment in CPU usage, now is constant in 1.2% or 1.4%, is this normal ?

Not sure what in the new build could account for it. The upload queue is busier so I guess that might be it if you regularly have people queuing stuff from you.

Anyone who's getting crashes on Windows, if you can please follow the instructions here:

https://docs.google.com/document/d/1WxE8ZQmTH8UqdM8WaxDfKf6N128ukm_lINMS...

to generate a crash report. If you manage to get one, post it here or send it to me as a private message.

Thanks!

Typical. Debug version set up and running, and it's behaving.....
I'll leave it running for the day and see if anything goes amiss.

That's interesting.....
Just looked at the file scan log, and noticed that it's doing a case-sensitive A-Z-a-z scan for the flac and ape files? The folder it had crashed on before is about 2/3rds of the way down the list. Goes from 'Yes' to 'arvo part' not long after, then finishes ok.

I updated the link to the release build in the article, just in case you want to try that one too.

Thanks, Nir

No - that one crashed overnight.
I'll go back to running the debug build

dogbite's picture

I think all of this TagLib is unnecessary. things were fine so far, that's my opinion. It might even slow down slsk. but since it's already in the process here are my suggestions: MP3 should have bitrate and if it's VBR it should simply show V0, V1, V2 instead of 140-217kbps or the like. i can't explain how awesome that would be. lossless files like flac/wav should only show bit depth and frequency in 24/96 form. i believe that's it. why would i wanna see some unimportant intel. length is really not that important, i mean i would sooner make a size filter than a length filter.

psynaturecybine's picture

indeed bitrate for lossless is useless, should be in form mentioned above

+1 on showing LAME encoding settings (V0, V1, etc.) if that info is in files with LAME tags, but -1 on trying to infer such things from the bitrate of any other file. When encoding VBR, the bitrate varies greatly depending on how tonally and spatially complex the sound is. Quiet, mono, and tonally simple music can be encoded with the highest quality setting and still come out with a very low bitrate because that's all it took to achieve the target quality (by whatever objective measure the encoder uses for "quality"). Aside from the encoder-specific settings stored in LAME tags, the INFO or XING header on VBR files can have a "quality" value from 0 to 100, which may or may not be filled in by the encoder with anything it wants, and there's no standard so you can't compare the numbers produced by different encoders. The VBR header can often be found on CBR files as well, but CBR is normally encoded without regard to quality, so whatever "quality" number is written there is useless.

Bit depth and sample rate would be nice for lossless, I agree. Display format is not too important, although I agree if both are together, bit depth should come first.

Length is important for me. I appreciate it in search results because the file names often don't convey accurate (or any) info about song versions, especially for edits and compilation appearances.

I also use taglib in my application and I extended it to give me the vbr info. And yes, I'd like bitrate and song length as accurate as possible. If I'm looking for a longer version of a song the bitrate itself doesn't help me much.
Thanks for your work.

Do you read the VBR info from the Xing header? Don't those only appear in MP3s encoded with Xing? I seem to remember that one had a pretty bad rep.

The XING header is the main info header for VBR files, no matter what encoder was used. Also LAME creates a XING header if encoding VBR files. The only other VBR header that was used but has quite disappeared by now is VBRI. For CBR files LAME creates an INFO header that is almost like XING but with very few changes.
However Taglib already knows how to scan these headers. The only missing piece was that it should store the fact that the bitrate is VBR. That's what I added.

Thank you, that's really useful information. It's a shame TagLib doesn't provide that information in XingHeader if it's available... I'll look into making the same modification too.

This is weird, I modified XingHeader to extract quality information if it's present, but i seem to always be getting quality 0. I only tried with a few hundred MP3s though. This is the modified version of XingHeader::parse:

https://gist.github.com/nirslsk/b4f28409c9ea1aa3c14b

Do you see anything wrong with it?

Scratch that, looks like TagLib only reads the first 16 bytes of the header, so it was returning 0 as a way of refusing to read outside the byte buffer. I changed it to read 120 bytes and now I'm getting quality info!
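For anyone wondering where the 120 comes from, here is a small standalone sketch of the Xing/Info field layout. It is not the code from the gist, and `xingQuality` and `readBE32` are names made up for the example. After the 4-byte tag and 4-byte flags, each optional field is present only if its flag bit is set: frame count (4 bytes), byte count (4 bytes), seek table (100 bytes), quality (4 bytes). With every flag set the quality field sits at bytes 116 to 119, so a 16-byte buffer can never reach it.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Reads a big-endian 32-bit value; the caller guarantees pos + 4 <= d.size().
static std::uint32_t readBE32(const std::vector<std::uint8_t>& d, std::size_t pos)
{
    return (std::uint32_t(d[pos]) << 24) | (std::uint32_t(d[pos + 1]) << 16) |
           (std::uint32_t(d[pos + 2]) << 8) | std::uint32_t(d[pos + 3]);
}

// Hypothetical helper: `header` starts at the "Xing"/"Info" tag, which the
// caller has already located. Returns the quality field only if the flags say
// it is present and it lies fully inside the buffer.
std::optional<std::uint32_t> xingQuality(const std::vector<std::uint8_t>& header)
{
    if (header.size() < 8) return std::nullopt;      // need tag + flags
    const std::uint32_t flags = readBE32(header, 4);
    if (!(flags & 0x8)) return std::nullopt;          // quality field not present

    std::size_t pos = 8;
    if (flags & 0x1) pos += 4;    // frame count
    if (flags & 0x2) pos += 4;    // byte count
    if (flags & 0x4) pos += 100;  // seek table (TOC)
    if (header.size() < pos + 4) return std::nullopt; // buffer too short
    return readBE32(header, pos);
}
```

Returning an empty optional for a short buffer, instead of 0, makes the "buffer too small" case distinguishable from a genuine quality value of 0.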

dingspro's picture

(Debug copy still not crashed after 1 day plus!)

Are the shares rescanned on startup?
Assume they are, but that would not stop another user's list being invalid. I spotted a folder that had been moved weeks ago, showing up on the 'not found' diagnostics tab, so I did a rescan and around 40 files were listed as removed.
Now, I restarted the debug version a day or two back after 1.21 had crashed overnight, but I don't remember doing any housekeeping of that volume during that time.

It did make me notice two things. The file rescan window holds its z-index too high: even if another window is brought to the front during this, the rescan sits on top of it. The width also seems to hold the longest folder/file name size, even though that's a bit pointless.

I'd keep the debug version going. If the release build crashes chances are it will as well. It's really just a release build with debug information so it should function about the same.

The shares do get rescanned when the client is started, not sure what happened there...

Not sure if that would help, but static code analyzers have helped me find the occasional error. A free one would be
http://cppcheck.sourceforge.net/

gibbs's picture

Yes but using TagLib will expand the audio properties shown beyond just mp3. Sounds good to me!

gibbs's picture

That was meant to be @dogbite

Touch wood, the non-debug version is working ok.
Debug version didn't crash for three days, but got interrupted by a reboot, so I went back to the 21 build.

That's good to hear. I'm still worried about that crash though, maybe someone will eventually be able to produce a crash report.

Thanks, Nir

error trapped!!!!

That is amazing, thank you. It actually looks like a potential bug in the C++ standard template library! This one includes a workaround that might fix it, debugging instructions are same as before:

https://www.dropbox.com/s/3boesqn3uqg6mru/SoulseekQt-2015-2-4-debug.7z?dl=0

Running new one now....

crashed again, but no stack to dump - lots of invalid parameter passed to C runtime function, which started at a new thread.
[New Thread 36152.0x1d298]
warning : invalid paramter.....

several of these, then
[Thread 36152.0x1d298 exited with code 0]

several more parameters

then all threads exit with code 3

finally

[Inferior 1 (process 36152) exited with code 03]

I've screen dumped these, just in case you might want to look at them?

If you don't have the gdb.txt file then sure, I'll take a look at the screenshot.

Thanks, Nir

Thanks! Yeah, there doesn't seem to be anything there other than the thread messages. May or may not have something to do with the crashes... these are probably file transfer threads. I've been working the last couple of weekends to transition SoulseekQt to non-threaded file transfers, so maybe that will make the client more stable. I'll post a link to a new build as soon as it's ready.

Non-threaded file transfer: https://www.dropbox.com/s/7zemnt7ngivasof/SoulseekQt-2015-2-8-debug.7z?dl=0

If you're completely sick of testing it's totally cool. I'm gonna do some testing on my own and then make a proper announcement on the front page to try and get some feedback.

Thanks, Nir

new debug up and running.....
Hey, testing is no different to running normal. Just takes a little while to do a normal close.

Great, thanks! This is the debug version of the build that fixes the issue of upload speed limit not being divided by the number of slots:

https://www.dropbox.com/s/41nvk29blf8jvrb/SoulseekQt-2015-2-9-debug.7z?dl=0

For symlinks and shortcuts, I'm curious to know what the indexing logic is. There are a lot of decisions to make about what to follow and what to ignore. Avoiding loops is only part of what needs to be done.

What if the target has NTFS 'system' or 'hidden' flags set or is actually missing or unreadable? Ignore the link, I hope.

What if the target of a public link is userlist-only, or not shared at all? The target better not be visible or accessible to anyone I haven't allowed it to be shared with.

Even loop avoidance causes problems. I assume you are avoiding indexing a folder if it was already indexed. What's the point of following folder links at all, then? And unless you delay symlink-following until after all the real folders have been scanned, you could end up with a situation where whichever path happened to be indexed first is the one that "wins". This can invalidate people's queued downloads if they wanted a folder that's no longer indexed by the time their turn comes up.

Windows shortcut filenames end in .lnk. If you transfer the target file but give it the link's name without stripping .lnk extension, the downloader's system will think it's a shortcut file. I saw this happening when shortcut following was enabled. You need to think about how to index and transfer .lnk files if you are following them like symlinks.
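A minimal sketch of the name handling this implies, assuming shortcut targets are shared under the shortcut's own name: drop a trailing ".lnk" (in any letter case) before advertising the name, so the downloader does not end up with a file that Windows treats as a shortcut. The function name `sharedNameForShortcut` is hypothetical and not part of SoulseekQt.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the name to advertise for a shortcut file,
// with a trailing ".lnk" removed. Other names are returned unchanged.
std::string sharedNameForShortcut(const std::string& linkFileName)
{
    const std::string ext = ".lnk";
    if (linkFileName.size() <= ext.size()) return linkFileName;  // nothing left after stripping

    const auto start = static_cast<std::ptrdiff_t>(linkFileName.size() - ext.size());
    // Compare the last four characters against ".lnk", ignoring letter case.
    const bool matches = std::equal(
        ext.begin(), ext.end(), linkFileName.begin() + start,
        [](char expected, char actual) {
            return expected ==
                   static_cast<char>(std::tolower(static_cast<unsigned char>(actual)));
        });
    return matches ? linkFileName.substr(0, static_cast<std::size_t>(start))
                   : linkFileName;
}
```

This only covers naming. Whether the target's real name or the shortcut's name is the better thing to show other users is a separate decision.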

If nothing else, I suggest making a toggle for disabling the following of symlinks entirely, so that if there are problems or unexpected complications, we can continue to use Soulseek as we have been. I do want to help make symlinks work, but if they're causing problems, I need to be able to shut 'em off.

Qt makes working with shortcuts on Windows a lot more difficult than symlinks on Linux/OSX (you basically can't use a shortcut directly as a folder), so for the time being indexing shortcuts is turned off in the latest nightly builds. You make a lot of good points, but ultimately I'd like to offer the user one simple scheme or another, either index all symlinks, or index no symlinks, configurable as an option. Hiding folders with the hidden or system attributes set is definitely a good idea. If I look into re-enabling shortcut indexing, I'll look into doing that as well.

thank you

thank you . i love this site

Superb work, Nir.

Just can't test it out yet because you are only providing your nightlies for Windows and Mac ;-)
But take your time: we 'Buntu users are very used to waiting (while others are messing around with gedit 3.14, aptitude will still install v3.10 from years ago ;)