Big downloads dropping / disconnected, speed dropping intermittently...?!

Crazy Serb

Well-Known Member
#1
I've got an issue similar to the one described here:
https://www.litespeedtech.com/support/forum/threads/huge-download-speed-drop.2504/

A client is offering few hundred large files for download to their customers, and some of those are around 1GB in size too.

The problem that neither me nor my webhost can figure out at this point (we've been going back & forth on all possible solutions with zero success) is two-fold:
1) why does the speed drop intermittently for larger downloads.
and
2) why do dozens of these downloads drop before they even reach 50%?

As far as #1 goes, the download starts off fast, at 2MB/s, then within 10-15 seconds goes to super slow speed above, and then over the next 5-10 minutes it goes fluctuating between the user's top allowable speed (mine is 12MB/s) and super slow one above, so it will hit up to 7MB/s on my end but then immediately go down to 15-20KB/s.
I've got a steady 100Mb line on my end, so it's definitely not me being the slow link in this chain - see the speedtest below at the exact same time when the speed of the download was going from 40-50KB/s to around 300-400KB/s:
http://prntscr.com/etu4yo


As far as #2 goes, it seems that for those with slower connections (or for those cases when the download speed drops down super low, like described above) the downloads break for whatever reason, disconnect and drop, and result in incomplete and corrupted downloads.
We have tried testing this from another server on the same network limiting the transfer to 100K/s, and could not get past 52% for the life of us (domain changed for client's privacy purposes):

curl --limit-rate 250K http://www.domain.com/downloads/elearning/TJPart3.zip -o TJPart3.zip
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
52 871M 52 460M 0 0 249k 0 0:59:28 0:31:26 0:28:02 308k
curl: (18) transfer closed with 430580540 bytes remaining to read

This is the best one we could get - others would cut out after 9-10% or so.

And there is no reason for any of this - the source server has been up for days, Litespeed has been up for days too, usage is down to 5-10% of the resources available at peak times, nothing funky in the logs, the tech guys have combed thru everything and can't find a single reason why this is happening.

Ideas?!

Using latest Litespeed 5.1.14 on CENTOS 7.3 x86_64 and cPanel & WHM 64.0 (build 9)
 

mistwang

LiteSpeed Staff
#2
For #1, you need to check if there is any bandwidth throttling configuration, using mod_bw style configuration, it has special directive for limiting large file download. Also, make sure your server does not set
https://www.litespeedtech.com/docs/webserver/config/tuning#sndBufSize
it should be "unset"

For #2, check if the server has been restarted during the download, if it cut off at random size, it could be the server being restarted too frequently.

If file is served dynamically through PHP, it is recommended to change to using internal redirect.
Last resort is to turn on debug logging, and find out what happened exactly from the log file.
 

Crazy Serb

Well-Known Member
#3
Well, Send Buffer Size was set to 0, so I removed that and set it to empty/not set.

Where would I look for bandwidth throttling configuration? I am 99% certain that is not even in place or enabled, as I'd know (I'd probably be the one to put it in place, if anyone).

No, the server hasn't been restarted at all - when I ran the uptime command on lsphp it said the last time it was restarted was 2 days ago. And we've ruled that one out already. Heck, when I do a graceful server restart that slow speed test with curl above still keeps going... oh wait, never mind, it just got cut off a few minutes later.

And no, these files are not served dynamically in any way - direct links only.

As far as debug logging goes, I've turned that on and set it to LOW and by looking at it after a few minutes I have absolutely no idea what any of that means or what to look for as it doesn't reference anything that makes any sense (file names, etc). I can track things by IP, but it records every single network packet it seems...
 
Last edited:

Crazy Serb

Well-Known Member
#4
This is what the debug log is showing:

error_log.txt(212831): 2017-04-08 00:05:44.725 [DEBUG] add Client:162.216.6.148
error_log.txt(212832): 2017-04-08 00:05:44.725 [DEBUG] [162.216.6.148:2702] concurrent conn: 1
error_log.txt(212833): 2017-04-08 00:05:44.725 [DEBUG] [162.216.6.148:2702] HttpSession::eek:nReadEx(), state: 1!
error_log.txt(212834): 2017-04-08 00:05:44.725 [DEBUG] [162.216.6.148:2702] readToHeaderBuf().
error_log.txt(212835): 2017-04-08 00:05:44.725 [DEBUG] [162.216.6.148:2702] Read from client: 224
error_log.txt(212836): 2017-04-08 00:05:44.725 [DEBUG] [162.216.6.148:2702] read 224 bytes to header buffer
error_log.txt(212837): 2017-04-08 00:05:44.725 [DEBUG] [162.216.6.148:2702] processHeader() return 0, header state: 3.
error_log.txt(212838): 2017-04-08 00:05:44.725 [DEBUG] [162.216.6.148:2702] processNewReq(), request header buffer size: 228, header used: 228, processed: 228.
error_log.txt(212839): 2017-04-08 00:05:44.725 [DEBUG] [162.216.6.148:2702#APVH_domain.com] NtwkIOLink::suspendRead()...
error_log.txt(212840): 2017-04-08 00:05:44.725 [DEBUG] [162.216.6.148:2702#APVH_domain.com] [CACHE] Early lookup, cache config: 0xda0a70, private: 0, public: 0
error_log.txt(212841): 2017-04-08 00:05:44.725 [DEBUG] [162.216.6.148:2702#APVH_domain.com] processContext() returned 0.
error_log.txt(212842): 2017-04-08 00:05:44.725 [DEBUG] [162.216.6.148:2702#APVH_domain.com] processContextPath() return 0
error_log.txt(212843): 2017-04-08 00:05:44.725 [DEBUG] [162.216.6.148:2702#APVH_domain.com] readyCacheData() return 0
error_log.txt(212844): 2017-04-08 00:05:44.725 [DEBUG] [162.216.6.148:2702#APVH_domain.com] endResponseInternal()
error_log.txt(212845): 2017-04-08 00:05:44.725 [DEBUG] [162.216.6.148:2702#APVH_domain.com] HttpSession::flush()!
error_log.txt(212846): 2017-04-08 00:05:44.725 [DEBUG] [162.216.6.148:2702#APVH_domain.com] flushBody() return 1
error_log.txt(212850): 2017-04-08 00:05:44.725 [DEBUG] [162.216.6.148:2702#APVH_domain.com] HttpSession::flush()!
error_log.txt(212851): 2017-04-08 00:05:44.740 [DEBUG] [162.216.6.148:2702#APVH_domain.com] NtwkIOLink written: 8417
error_log.txt(212852): 2017-04-08 00:05:44.740 [DEBUG] [162.216.6.148:2702#APVH_domain.com] NtwkIOLink written: 8192
error_log.txt(212853): 2017-04-08 00:05:44.740 [DEBUG] [162.216.6.148:2702#APVH_domain.com] NtwkIOLink written: 8192
error_log.txt(212854): 2017-04-08 00:05:44.740 [DEBUG] [162.216.6.148:2702#APVH_domain.com] NtwkIOLink written: 8192
error_log.txt(212855): 2017-04-08 00:05:44.740 [DEBUG] [162.216.6.148:2702#APVH_domain.com] NtwkIOLink written: 8192
error_log.txt(212856): 2017-04-08 00:05:44.740 [DEBUG] [162.216.6.148:2702#APVH_domain.com] NtwkIOLink written: 8192
error_log.txt(212857): 2017-04-08 00:05:44.740 [DEBUG] [162.216.6.148:2702#APVH_domain.com] NtwkIOLink written: 8192
error_log.txt(212858): 2017-04-08 00:05:44.740 [DEBUG] [162.216.6.148:2702#APVH_domain.com] NtwkIOLink written: 8192
error_log.txt(212859): 2017-04-08 00:05:44.740 [DEBUG] [162.216.6.148:2702#APVH_domain.com] NtwkIOLink written: 8192
error_log.txt(212860): 2017-04-08 00:05:44.740 [DEBUG] [162.216.6.148:2702#APVH_domain.com] NtwkIOLink written: 5687
error_log.txt(212861): 2017-04-08 00:05:44.740 [DEBUG] [162.216.6.148:2702#APVH_domain.com] flushBody() return 1
error_log.txt(212862): 2017-04-08 00:05:44.740 [DEBUG] [162.216.6.148:2702#APVH_domain.com] NtwkIOLink::continueWrite()...
error_log.txt(212863): 2017-04-08 00:05:44.740 [DEBUG] [162.216.6.148:2702#APVH_domain.com] write resumed!
error_log.txt(212864): 2017-04-08 00:05:44.741 [DEBUG] [162.216.6.148:2702#APVH_domain.com] HttpSession::flush()!
error_log.txt(212865): 2017-04-08 00:05:44.741 [DEBUG] [162.216.6.148:2702#APVH_domain.com] NtwkIOLink written: 8192
error_log.txt(212866): 2017-04-08 00:05:44.741 [DEBUG] [162.216.6.148:2702#APVH_domain.com] NtwkIOLink written: 8192

..... that block repeats a bunch of time.... and then at the end of it all, when the download was cut off:

error_log.txt(736372): 2017-04-08 00:13:25.446 [DEBUG] [162.216.6.148:2702#APVH_domain.com] NtwkIOLink written: 8192
error_log.txt(736373): 2017-04-08 00:13:25.446 [DEBUG] [162.216.6.148:2702#APVH_domain.com] NtwkIOLink written: 8192
error_log.txt(736374): 2017-04-08 00:13:25.446 [DEBUG] [162.216.6.148:2702#APVH_domain.com] NtwkIOLink written: 8192
error_log.txt(736375): 2017-04-08 00:13:25.446 [DEBUG] [162.216.6.148:2702#APVH_domain.com] NtwkIOLink written: 5936
error_log.txt(736376): 2017-04-08 00:13:25.446 [DEBUG] [162.216.6.148:2702#APVH_domain.com] flushBody() return 1


And that's it.

It just stopped there, for no apparent reason (the file download cut off at 10% completion)

At least not to me.

Any idea what any of that means?
 

Crazy Serb

Well-Known Member
#5
and here is the FULL log - again, the simulated download at 200K/s cut off at 29%:


\error_log.txt(520196): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269049991
\error_log.txt(520197): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(520198): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269058183
\error_log.txt(520199): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(520200): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269066375
\error_log.txt(520201): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(520202): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269074567
\error_log.txt(520203): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(520204): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269082759
\error_log.txt(520205): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(520206): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269090951
\error_log.txt(520207): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(520208): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269099143
\error_log.txt(520209): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(520210): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269107335
\error_log.txt(520211): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(520212): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269115527
\error_log.txt(520213): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 5184
\error_log.txt(520214): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269120711
\error_log.txt(520215): 2017-04-08 01:03:24.800 [DEBUG] [162.216.6.148:3969#APVH_domain.com] flushBody() return 1
\error_log.txt(523709): 2017-04-08 01:03:26.841 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink::handleEvents() events=4!
\error_log.txt(523710): 2017-04-08 01:03:26.841 [DEBUG] [162.216.6.148:3969#APVH_domain.com] HttpSession::flush()!
\error_log.txt(523711): 2017-04-08 01:03:26.841 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523712): 2017-04-08 01:03:26.841 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269128903
\error_log.txt(523713): 2017-04-08 01:03:26.841 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
.......
\error_log.txt(523796): 2017-04-08 01:03:26.841 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269472967
\error_log.txt(523797): 2017-04-08 01:03:26.841 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523798): 2017-04-08 01:03:26.841 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269481159
\error_log.txt(523799): 2017-04-08 01:03:26.841 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 2873
\error_log.txt(523800): 2017-04-08 01:03:26.841 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269484032
\error_log.txt(523801): 2017-04-08 01:03:26.841 [DEBUG] [162.216.6.148:3969#APVH_domain.com] submit AIO request, id: 9406E823, fd: 106, offset: 269484032, size: 1048576.
\error_log.txt(523802): 2017-04-08 01:03:26.841 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink::suspendWrite()...
\error_log.txt(523803): 2017-04-08 01:03:26.841 [DEBUG] [162.216.6.148:3969#APVH_domain.com] flushBody() return 1
\error_log.txt(523804): 2017-04-08 01:03:26.842 [DEBUG] [162.216.6.148:3969#APVH_domain.com] AIO request finished, id: 9406E823, failed: 0, state: 10, wantWrite: 0
\error_log.txt(523805): 2017-04-08 01:03:26.842 [DEBUG] [162.216.6.148:3969#APVH_domain.com] HttpSession::flush()!
\error_log.txt(523806): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523807): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269492224
\error_log.txt(523808): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523809): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269500416
\error_log.txt(523810): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523811): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269508608
\error_log.txt(523812): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523813): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269516800
\error_log.txt(523814): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523815): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269524992
\error_log.txt(523816): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523817): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269533184
\error_log.txt(523818): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523819): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269541376
\error_log.txt(523820): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523821): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269549568
\error_log.txt(523822): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523823): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269557760
\error_log.txt(523824): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523825): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269565952
\error_log.txt(523826): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523827): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269574144
\error_log.txt(523828): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523829): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269582336
\error_log.txt(523830): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523831): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269590528
\error_log.txt(523832): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523833): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269598720
\error_log.txt(523834): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523835): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269606912
\error_log.txt(523836): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523837): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269615104
\error_log.txt(523838): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523839): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269623296
\error_log.txt(523840): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523841): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269631488
\error_log.txt(523842): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 8192
\error_log.txt(523843): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269639680
\error_log.txt(523844): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink written: 2311
\error_log.txt(523845): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] writeRespBody() sent: 269641991
\error_log.txt(523846): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] flushBody() return 1
\error_log.txt(523847): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] NtwkIOLink::continueWrite()...
\error_log.txt(523848): 2017-04-08 01:03:26.843 [DEBUG] [162.216.6.148:3969#APVH_domain.com] write resumed!
 

mistwang

LiteSpeed Staff
#7
Please change debug level to HIGH and do the same test. just track related log entries with the IP and port combination.
You should see log entries at the end that connection was closed no matter what is the reason.
The current log shows it is still trying to send remaining file, no close.
 
Top