Discussion:
Are text-only peers being too restrictive at <256k article size?
Jesse Rehmer
2022-06-18 18:50:26 UTC
Permalink
I have followed what seems like a general consensus among text-only
peers to limit article size to 128/256k when setting up peering
arrangements. However, I've begun a project to backfill my spool using
tools like pullnews and suck from a large spool.

I never changed INN's maximum article size (the maxartsize setting in
inn.conf) from its default of 1,000,000 bytes. Out of curiosity, after
filling a few Big8 hierarchies from the upstream spool, I searched the
spool for articles >500k and, to my surprise, I've been missing out on
a lot of valid and (in my opinion) valuable articles.
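
A spool scan like the one described can be sketched with find(1). The
toy spool below is a stand-in; on a real server you would point find at
the tradspool root (commonly /var/spool/news/articles, but the path
varies by installation):

```shell
# Build a toy spool, then list "articles" over 500 KiB.
# On a real server, point find at the tradspool root instead.
spool=$(mktemp -d)
head -c 600000 /dev/zero > "$spool/big.article"    # ~586 KiB
head -c 2000   /dev/zero > "$spool/small.article"  # ~2 KiB
large=$(find "$spool" -type f -size +500k)
echo "$large"
rm -rf "$spool"
```

On a live spool the one-liner is simply
`find /var/spool/news/articles -type f -size +500k`.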

I'm finding a ton of FAQs, some checkgroup messages, lots of source code
discussions, etc. that I've never seen before. I read a thread from a
few years ago where Russ Allbery mentions by not accepting articles up
to 1MB you may be missing out on quite a bit. From a very quick scan of
what I've found larger than 500k, he's right.

Seems I've been a bit hasty requesting peering parameters around a 256k
limit. I assume this common parameter was adopted to limit the chances
of misplaced binary articles leaking through a feed, but how "harmful"
is it to have such low limits when I'm seeing this much valid traffic
that I believe many news admins would want to carry?

Admittedly, it usually takes trial and error with peers using Cyclone
(and quite a few with Diablo who disable its internal article type
check) to properly configure their feeds not to leak binaries to
text-only peers, but I'm not seeing much of that these days.

Cheers,

Jesse
The Doctor
2022-06-18 22:21:03 UTC
Permalink
Post by Jesse Rehmer
I have followed what seems like a general consensus among text-only
peers to limit article size at 128/256k when setting up peering
arrangements. However, I've begun a project to back fill my spool using
tools like pullnews and suck from a large spool.
I never changed INN's max. article size from 1000000. Out of curiosity,
after filling a few Big8 hierarchies from the upstream spool, I searched
the spool for articles >500k and to my surprise I've been missing out on
a lot of valid and (in my opinion) valuable articles.
I'm finding a ton of FAQs, some checkgroup messages, lots of source code
discussions, etc. that I've never seen before. I read a thread from a
few years ago where Russ Allbery mentions by not accepting articles up
to 1MB you may be missing out on quite a bit. From a very quick scan of
what I've found larger than 500k, he's right.
Seems I've been a bit hasty requesting peering parameters around a 256k
limit. I assume this common peering parameter was adopted to limit the
chances of misplaced binary articles coming through a feed, but am
wondering how "harmful" it is to have such low limits when I'm seeing a
high rate of valid communication that I believe many news admins would
want to carry?
Admittedly, it usually takes trial and error with peers using Cyclone
(and quite a few with Diablo who disable its internal article type
check) to properly configure their feeds not to leak binaries to
text-only peers, but I'm not seeing much of that these days.
Cheers,
Jesse
Try

:Tm,<512000
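
(For readers who don't recognize the syntax: `<size` is the per-feed
size cap among INN's newsfeeds(5) flags, so `<512000` passes only
articles under 512,000 bytes. In a full feed entry it would sit with
the other flags; the peer name and patterns below are illustrative,
not anyone's actual configuration:

```
# newsfeeds entry sketch: file feed, no binary groups, <512000 bytes
textpeer.example.com:*,@*.binaries.*:Tf,Wnm,<512000:
```

)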
--
Member - Liberal International This is doctor@@nl2k.ab.ca Ici doctor@@nl2k.ab.ca
Yahweh, Queen & country!Never Satan President Republic!Beware AntiChrist rising!
Look at Psalms 14 and 53 on Atheism https://www.empire.kred/ROOTNK?t=94a1f39b
Denial of our faults condemns us to their permanence. -unknown Beware https://mindspring.com
Richard Kettlewell
2022-06-19 08:31:22 UTC
Permalink
Post by Jesse Rehmer
I have followed what seems like a general consensus among text-only
peers to limit article size at 128/256k when setting up peering
arrangements. However, I've begun a project to back fill my spool
using tools like pullnews and suck from a large spool.
I never changed INN's max. article size from 1000000. Out of
curiosity, after filling a few Big8 hierarchies from the upstream
spool, I searched the spool for articles >500k and to my surprise I've
been missing out on a lot of valid and (in my opinion) valuable
articles.
I track what the largest articles in my spool are and I don’t think I’ve
seen anything over 500Kbyte - locally at least, the largest tend to be
in the 100-120Kbyte region.

A lot of the larger articles are spam - 8 of a recent top 10 were copies
of the same article, for instance. For this reason I’ve now adopted a
much lower EMP (excessive multi-posting) threshold for large articles.
--
https://www.greenend.org.uk/rjk/
Thomas Hochstein
2022-06-20 18:39:32 UTC
Permalink
Post by Jesse Rehmer
I have followed what seems like a general consensus among text-only
peers to limit article size at 128/256k when setting up peering
arrangements.
That seems rather small.
Post by Jesse Rehmer
I'm finding a ton of FAQs, some checkgroup messages, lots of source code
discussions, etc. that I've never seen before.
Yep.
Post by Jesse Rehmer
I assume this common peering parameter was adopted to limit the
chances of misplaced binary articles coming through a feed, but am
wondering how "harmful" it is to have such low limits when I'm seeing a
high rate of valid communication that I believe many news admins would
want to carry?
It seems wrong to filter on size if you really want to filter on type.
Cleanfeed can block binaries in non-binary groups, regardless of size;
then you can accept text-only messages regardless of size, too.
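
The type-based approach Thomas describes can be illustrated with a
rough shell sketch. This is the general idea only, not Cleanfeed's
actual code or configuration (Cleanfeed is a Perl filter with its own
settings); it just looks for common binary payload markers:

```shell
# Flag articles that carry binary payload markers (uuencode, yEnc,
# base64 transfer encoding) regardless of their size.
is_binary_article() {
  grep -q -E '^(begin [0-7]{3} |=ybegin |Content-Transfer-Encoding: base64)' "$1"
}

printf 'begin 644 file.bin\nM0123\nend\n' > uu.txt    # uuencode-style body
printf 'Plain discussion text.\n'         > plain.txt # ordinary text
for f in uu.txt plain.txt; do
  if is_binary_article "$f"; then echo "$f: binary"; else echo "$f: text"; fi
done
```

A real filter does considerably more (MIME parsing, line-length
heuristics, encoded-line ratios), but the point stands: it classifies
by content, not by byte count.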

-thh
noel
2022-06-21 09:48:30 UTC
Permalink
Post by Thomas Hochstein
Post by Jesse Rehmer
I have followed what seems like a general consensus among text-only
peers to limit article size at 128/256k when setting up peering
arrangements.
That seems rather small.
If you've got half-megabyte text posts, they are not text but binaries
hiding as text; either that or it's a year-long thread and everyone is
too clueless to trim their damn posts. Sure, the odd FAQ post might get
caught up in it, but it's gonna be rare.
b***@ripco.com
2022-06-21 15:43:07 UTC
Permalink
Post by noel
Post by Thomas Hochstein
Post by Jesse Rehmer
I have followed what seems like a general consensus among text-only
peers to limit article size at 128/256k when setting up peering
arrangements.
That seems rather small.
If you've got half megabyte text posts, they are not text but binaries
hiding as text, either that or its a year long thread and everyone is too
clueless to trim their damn posts, sure the odd FAQ post might get caught
up in it, but its gonna be rare.
I'm in total agreement.

Even at a 250k limit, anyone still on an 80x24 character screen (like me)
would be going through 130 pages.
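
(That figure checks out, assuming a full 80x24 screen per page:

```shell
# characters per screenful: 80 * 24 = 1920
# pages for a 250 kB article, integer division
echo $(( 250000 / (80 * 24) ))   # prints 130
```

)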

The largest post I can find on my spool is just a bit over 210k, and
it's just a FAQ that probably no one reads.

I really would like to know what posts Jesse found that would be of any
interest to anyone in the 500k to 1m range.

-bruce
***@ripco.com
Nigel Reed
2022-06-22 08:04:37 UTC
Permalink
On Sat, 18 Jun 2022 13:50:26 -0500
Post by Jesse Rehmer
I have followed what seems like a general consensus among text-only
peers to limit article size at 128/256k when setting up peering
arrangements. However, I've begun a project to back fill my spool
using tools like pullnews and suck from a large spool.
I never changed INN's max. article size from 1000000. Out of
curiosity, after filling a few Big8 hierarchies from the upstream
spool, I searched the spool for articles >500k and to my surprise
I've been missing out on a lot of valid and (in my opinion) valuable
articles.
Four of my ten peers (I'm always open to more) have requested
restrictions from 131972 to 512000. I have no restriction here.
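
In INN terms, "no restriction" presumably corresponds to zeroing the
maxartsize parameter in inn.conf (whose default is 1000000 bytes),
along these lines:

```
# inn.conf fragment; 0 disables the local article-size check
maxartsize: 0
```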
--
End Of The Line BBS - Plano, TX
telnet endofthelinebbs.com 23
Borg
2022-07-17 11:33:49 UTC
Permalink
Post by Jesse Rehmer
I have followed what seems like a general consensus among text-only
peers to limit article size at 128/256k when setting up peering
arrangements.
[...]

In the past I have used servers that limited posts to 32k and 64k. If I
were running a server today I would probably limit new posts to 64k. The
way I see it Usenet is for discussion, not encyclopedia publication.
Just my personal opinion ...

If someone is publishing a huge FAQ or something they think is really
important they can divide it into a series of posts. The work to create
such a large post is far in excess of the minor ten-second hassle of
'head / tail -c 65536' or copypasta to two or three posts.
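
The chop-it-up step can equally be done in one shot with split(1); the
file name here is made up for the demo:

```shell
# Fake a ~200 kB FAQ, then cut it into 64 KiB pieces for separate posts.
head -c 200000 /dev/zero | tr '\0' 'x' > faq.txt
split -b 65536 faq.txt faq.part.
parts=$(ls faq.part.* | wc -l)    # 4 pieces: 3 full + 1 remainder
total=$(cat faq.part.* | wc -c)   # 200000 bytes, nothing lost
echo "$parts parts, $total bytes"
rm -f faq.txt faq.part.*
```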

--

Borg
