[clug] Downloading binaries from Newsgroups?
Felix Karpfen
2010-01-05 20:46:32 UTC
I have always regarded this option as a recipe for disaster and made no
effort to explore what newsgroups that specialise in binaries have
to offer.

Have I missed something?

This query is inspired by a recent posting that describes a failed
binaries download (see below). And I understand *nothing* of what it
says!

I would welcome being pointed to documentation that would enable me at
least to understand what the poster is talking about.

Comments on the benefits/hazards of downloading binaries from NGs would
be a bonus.

Felix Karpfen

Downloaded Posting
==================

I grabbed what was probably the same .nzb you did, from a binsearch.info
on the subject string in your original post. Said return was oddly
lacking any detail about the multiparts.

I fed the .nzb to Pan 1.33 (from 09SEP09 git checkout [checkout probably
the wrong term for git but I'm new to it]), and it downloaded 1.3 GB in
6,124 files, almost all of which looked like this:

248 Crazy Christmas Lights-2009-12-06-0_copy_1000.tp
4 Crazy Christmas Lights-2009-12-06-0_copy_1000.tp.ERRORS
248 Crazy Christmas Lights-2009-12-06-0_copy_1001.tp
4 Crazy Christmas Lights-2009-12-06-0_copy_1001.tp.ERRORS
248 Crazy Christmas Lights-2009-12-06-0_copy_1002.tp
4 Crazy Christmas Lights-2009-12-06-0_copy_1002.tp.ERRORS

...etc. ad nauseam

The .ERRORS files all look like this:

Warning: Missing everything before part #6745

But hark! Note the size of the single .tp file at the very end of this
mess. Why, a CD-sized file all decoded and everything. What be it?

664388 Crazy Christmas Lights-2009-12-06-0.tp
24 Crazy Christmas Lights-2009-12-06-0.tp.ERRORS
628 Crazy Christmas Lights-2009-12-06-0.tp.nzb

The .ERRORS file for this CD-sized .tp file is full of 379 entries like
this:

ERROR: %Part 2554 missing. Decoded file probably corrupt.

...but Pan decoded what it could find on Easynews.

Next, the google showed me that .tp is a file type for 'MPEG-2 TV
recorded file: File extension is used for MPEG-2 TV recorded file (Use
mpeg-2 compression).'

So I fed the CD-sized .tp to Kaffeine and it plays just fine. It's the
first 14 minutes of an hour-long show about over-the-top xmas
decorations aired on The Learning Channel, featuring setups you've
probably already seen on YouTube and such.

Is this a borked Usenet post? Boy howdy, is it ever! What kind of idjit
would post in that format in the first place? A clueless one, is what
kind ;-)

I've never seen this particular flavor of borkitude before, but it
appears to me that Pan is as uncontaminated by blame for this mess as
the poster is uncontaminated by clues.
--
Felix Karpfen
Public Key 72FDF9DF (DH/DSA)
Nathan O'Sullivan
2010-01-05 21:28:35 UTC
As you appear to have found, binaries nowadays are usually downloaded
through the use of an .nzb file, which I believe is basically a listing
of the Usenet posts required.

Specialized software is available that utilises these nzb files. I am
partial to hellanzb on Linux, even though it's abandoned.

I would suggest starting by trying that instead of Pan.
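
To make the "listing of the Usenet posts required" concrete, here is a minimal Python sketch. It assumes the commonly used NZB XML layout and its namespace URI; the sample subject, poster, group name and message-ids are made up for illustration and do not come from any real post.

```python
import xml.etree.ElementTree as ET

# Namespace of the common NZB layout (an assumption about the file at hand).
NS = {"nzb": "http://www.newzbin.com/DTD/2003/nzb"}

# Made-up sample: one file split across two posts ("segments").
SAMPLE = """<nzb xmlns="http://www.newzbin.com/DTD/2003/nzb">
  <file poster="someone@example.invalid" date="1262724392"
        subject="example.bin (1/2)">
    <groups><group>alt.binaries.example</group></groups>
    <segments>
      <segment bytes="1000" number="2">part2@example.invalid</segment>
      <segment bytes="1500" number="1">part1@example.invalid</segment>
    </segments>
  </file>
</nzb>"""


def list_segments(nzb_text):
    """Return one dict per <file>: subject, groups, and segments in order."""
    root = ET.fromstring(nzb_text)
    files = []
    for f in root.findall("nzb:file", NS):
        # Each segment is (part number, size in bytes, message-id).
        segments = sorted(
            (int(s.get("number")), int(s.get("bytes")), s.text.strip())
            for s in f.findall("nzb:segments/nzb:segment", NS)
        )
        files.append({
            "subject": f.get("subject"),
            "groups": [g.text for g in f.findall("nzb:groups/nzb:group", NS)],
            "segments": segments,
        })
    return files


for entry in list_segments(SAMPLE):
    print(entry["subject"], entry["groups"])
    for number, size, message_id in entry["segments"]:
        print("  part", number, size, "bytes", message_id)
```

A downloader then fetches each message-id from the news server, decodes it, and joins the parts in numeric order.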
Felix Karpfen
2010-01-06 01:22:17 UTC
Post by Nathan O'Sullivan
I would suggest starting by trying that (hellanzb) instead of Pan.
Thank you for the advice.

I note that "hellanzb" can readily be installed from my Debian Lenny
disks.

And, from its web page, the USING instructions for hellanzb are
engagingly simple.

But I am looking for pointers to a guided tour of the "NZB World".

I have located:

https://www.binsearch.info/faq.php

but find its content to be brief and cryptic.

I would welcome a more leisurely exposition of the topic, unless - as
Steve Jenkin's direct reply hints - I am better off by continuing in my
ignorance.

Felix Karpfen
--
Felix Karpfen
Public Key 72FDF9DF (DH/DSA)
Steve McInerney
2010-01-06 05:30:46 UTC
Post by Felix Karpfen
I would suggest starting by trying that (hellanzb) instead of Pan.
Thank you for the advice.
You could also try:
* save all the uuencoded text in the various postings into numerically
sequenced files.
* sed/vi the extraneous gumpf around the ---cut here--- lines in all
those files
* cat the files back together and uudecode the lot

Which was a successful[1] method of extracting binaries from newsgroups
back in 1990. ;-)


Cheers!
- Steve
[1] I said successful; I didn't describe how painful.
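
The three manual steps above can be sketched in Python. This is only an illustration under stated assumptions: the posts are already saved as text in the right order, the function names are made up, and the filter that skips headers and cut-here lines is a heuristic rather than a full uuencode parser.

```python
import binascii


def looks_uuencoded(line):
    """Heuristic: the first character of a uuencoded line gives its byte count."""
    if not line or any(not 32 <= ord(c) <= 96 for c in line):
        return False
    n = (ord(line[0]) - 32) & 0x3F          # decoded bytes promised by this line
    return len(line) - 1 == 4 * ((n + 2) // 3)


def join_and_uudecode(posts):
    """Decode one uuencoded file spread over several posts, given in order."""
    decoded = bytearray()
    started = False
    for post in posts:
        for line in post.splitlines():
            if not started:
                # Everything before the "begin <mode> <name>" line is ignored.
                started = line.startswith("begin ")
                continue
            if line == "end":
                return bytes(decoded)
            if looks_uuencoded(line):       # skip headers, cut-here lines, etc.
                decoded += binascii.a2b_uu(line)
    return bytes(decoded)                   # no "end" seen: return what we have
```

Calling `join_and_uudecode([text_of_post_1, text_of_post_2, ...])` returns the decoded bytes. A stray text line that happens to pass the heuristic would corrupt the result, which is part of why the 1990 method was painful.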
Hal Ashburner
2010-01-06 07:27:44 UTC
Post by Steve McInerney
Post by Felix Karpfen
I would suggest starting by trying that (hellanzb) instead of Pan.
Thank you for the advice.
* save all the uuencoded text in the various postings into numerically
sequenced files.
* sed/vi the extraneous gumpf around the ---cut here--- lines in all
those files
* cat the files back together and uudecode the lot
Which was a successful method of extracting binaries from news groups
back in 1990. ;-)
So maybe like me, Felix wants to know more generally about binaries and
newsgroups as a background to work out if he wants to know more and to
enable that if so. Hmm.
In the words of Manuel(+), "I know nothing about the horse!"
And I feel this might be the way to go on this topic. Please if somebody
knows something more do tell Felix and myself so we know too. I tend to
think of binaries on newsgroups as a place of ugliness, evil and general
yuk. Yes I do think this on not much more than pure prejudice.
Why would you distribute a binary over news?
If it were legal or tasteful why not stick it on a normal webserver or
seed a torrent?
I'm not talking about the days of yore but right now in 2010 is there
any good reason to distribute binaries over news? The mutterings I've
heard are that it's used for people to share things that can't be unseen,
which I really don't want any part of in any way, shape or form.

Maybe train spotters attach photos or short movies of their favourite
trains on alt.trains.something and this is better than a web forum for
some reason.
I've heard of people using it to get Hollywood movies and warez** and so
on as well as the hideous(++) stuff.

So yeah I'm unintentionally trolling badly but do please jump in.

Technically, news works similarly to email: you can break up a large file
into smaller pieces, then uuencode each piece and attach it to a separate
newsgroup post. There is software that automates downloading all the
relevant newsgroup posts, decodes the attachments and reassembles them
into the original file. We should probably read up on news (RFC* 977,
http://www.faqs.org/rfcs/rfc977.html) and uuencode to get the basics, then
look up the documentation for the software mentioned that automates it
if we want to know more about it all.
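
As a rough illustration of the split, encode and reassemble cycle described above, here is a small Python sketch. The function names and the (index, total, body) tuple are made up for the example, and the "begin"/"end" framing of a real uuencoded file is left out to keep it short.

```python
import binascii

LINE_BYTES = 45  # uuencode carries at most 45 raw bytes per text line


def split_into_posts(data, lines_per_post):
    """Uuencode `data` and cut the text into numbered posts: (index, total, body)."""
    lines = [
        binascii.b2a_uu(data[i:i + LINE_BYTES]).decode("ascii")
        for i in range(0, len(data), LINE_BYTES)
    ]
    chunks = [lines[i:i + lines_per_post]
              for i in range(0, len(lines), lines_per_post)]
    return [(n, len(chunks), "".join(chunk))
            for n, chunk in enumerate(chunks, start=1)]


def reassemble(posts):
    """Rebuild the original bytes; raise ValueError if any part is absent."""
    if not posts:
        raise ValueError("no posts given")
    total = posts[0][1]
    by_index = {index: body for index, _, body in posts}
    missing = [n for n in range(1, total + 1) if n not in by_index]
    if missing:
        raise ValueError(f"missing parts: {missing}")
    out = bytearray()
    for n in range(1, total + 1):           # arrival order does not matter
        for line in by_index[n].splitlines():
            out += binascii.a2b_uu(line)
    return bytes(out)
```

Posts can arrive in any order, but a single absent part makes the file unrecoverable without some extra redundancy. That is the situation the "Part 2554 missing" errors in the quoted posting describe.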


So I'm just going to sum it all up like this:
"Felix: Stay away from newsgroup binaries, you don't want that stuff. I
might well be wrong, someone will say so in reply if I am."


The stupid jargon decoding section:
*RFC = Request for Comments. This actually means "Internet Standard" and
is the rules you need to follow to write software that will work with
other people's stuff. For an email program to work it has to do what the
RFC says or the mail won't get anywhere much. mutt, thunderbird, pine,
kmail, evolution, outlook, lotus all have to agree on what email is,
that agreement is meant to be the RFC on the particular aspect of email.

**warez = software that is being distributed and used without a proper
license from the copyright holder. Eg a cracked copy of Windows ME might
be distributed without the proper license as "warez." These are often
infected with malware (viruses, trojan, spying, baby-eating programs
hidden inside them). Windows ME has been considered by some as malware
without additional infection.

(+) Fawlty Towers
(++) hideous = even more disturbing than the prime minister eating his
earwax.
Felix Karpfen
2010-01-06 20:42:30 UTC
Post by Hal Ashburner
I've heard of people using it to get hollywood movies and warez** and so
on as well as the hideous(++) stuff.
I took a brief look at what was on offer at

https://www.binsearch.info

The English language stuff, that I checked, was singularly uninspiring
(some of it might have aroused my curiosity 60 years ago; but, alas, I am
well and truly "over the hill").

But speakers of foreign languages fare marginally better.

Here is an entry for the DVDs of "The Lord of the Rings - Part 2":

View all 14 posts by JBinUp.com <JBinUp at JBinUp.local> [Multiple posts by
same poster hidden]

<<www.illuminatenboard.org>>Der.Herr.der.Ringe.Die.zwei.Tuerme.German.WS.DL.DVD9.untouched.DGB>
(1/176) "dHdR.Die.zwei.Tuerme.DVD9.untouched.DGB.par2" yEnc (1/1)
collection size: 8.32 GB, parts available: 22462 / 22462
- 164 rar files
- 12 par2 files

At present, I have *no* idea what that says; and I do not intend to
embark on an 8.32 GB download in order to find out.
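
For what it is worth, the subject line in that listing can be picked apart mechanically. The Python sketch below assumes the usual posting convention: the leading "(1/176)" is the file's position in the collection (164 rar + 12 par2 = 176 files, which matches the listing), and the trailing "(1/1)" is the part number within that one file. "yEnc" names the encoding used for the attachment. The regular expression and function name are made up for this example.

```python
import re

SUBJECT_RE = re.compile(
    r'\((?P<file_no>\d+)/(?P<file_total>\d+)\)\s+'        # (1/176): file 1 of 176
    r'"(?P<filename>[^"]+)"\s+'                           # quoted file name
    r'yEnc\s+\((?P<part_no>\d+)/(?P<part_total>\d+)\)'    # yEnc (1/1): part 1 of 1
)


def parse_subject(subject):
    """Split a binaries subject line into its counters and file name."""
    match = SUBJECT_RE.search(subject)
    if match is None:
        return None
    info = match.groupdict()
    for key in ("file_no", "file_total", "part_no", "part_total"):
        info[key] = int(info[key])
    return info


subject = '(1/176) "dHdR.Die.zwei.Tuerme.DVD9.untouched.DGB.par2" yEnc (1/1)'
print(parse_subject(subject))
```

So this particular entry is the first of 176 files, a small par2 index file that fits in a single post.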
Post by Hal Ashburner
So I'm just going to sum it all up like this: "Felix: Stay away from
newsgroup binaries, you don't want that stuff. I might well be wrong,
someone will say so in reply if I am."
That has been my approach up till now. And it has served me well so
far.

Thank you for taking the trouble to write such a detailed reply.

Felix
--
Felix Karpfen
Public Key 72FDF9DF (DH/DSA)
Adam Baxter
2010-01-07 23:10:16 UTC
I use a Python-based program called SABnzbd to grab Usenet binaries.

It supports ".par2" parity files which can be used to repair damaged
posts such as the one mentioned above. PAR2 has become a de facto
standard, and as such, most posts will come with a .par2 "set" that
can be used to repair anywhere from 10% to 30% of the original file,
at the cost of extra downloads.

Most smart Usenet clients recognise par2 sets, and only download them if needed.
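
To give a feel for how parity data can repair a damaged download, here is a deliberately simplified Python sketch. It uses a single XOR parity block, which can rebuild only one missing block. Real PAR2 uses Reed-Solomon coding and can rebuild as many blocks as there are recovery blocks, so treat this as an illustration of the idea and not of the PAR2 format. All names are made up for the example.

```python
def xor_blocks(blocks):
    """XOR equal-length byte blocks together."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def make_parity(blocks):
    """The parity block is simply the XOR of all data blocks."""
    return xor_blocks(blocks)


def repair(blocks, parity):
    """Fill in at most one missing block (marked None) using the parity block."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) > 1:
        raise ValueError("one XOR parity block can rebuild only one missing block")
    repaired = list(blocks)
    if missing:
        present = [block for block in blocks if block is not None]
        # XOR of the parity with every surviving block yields the lost one.
        repaired[missing[0]] = xor_blocks(present + [parity])
    return repaired


data = [b"abcd", b"efgh", b"ijkl"]
parity = make_parity(data)
print(repair([b"abcd", None, b"ijkl"], parity))
```

The extra download is the cost: the parity block is as large as a data block, which is why clients tend to fetch recovery files only when something is actually missing.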

Die.zwei.Tuerme.DVD9.untouched.DGB.par2

Is the German name for The Two Towers movie ;)

DVD9.untouched means it is a direct copy of the original DVD without
additional compression, as you can probably see by the filesize.

This is getting far, far off-topic for this list.

--Adam
--
linux mailing list
linux at lists.samba.org
https://lists.samba.org/mailman/listinfo/linux