openLTE / Discussion / General Discussion: OpenLTE B200 / Clock Drift?

hyerisf - 2017-04-25

Hi,

I'm trying to get OpenLTE working with a USRP B200. It currently works only intermittently: occasionally the cell shows up on the UE, but most of the time I see nothing. Settings-wise, there's nothing controversial:

band 5 (or whatever)
bandwidth 5
tx_gain 85
rx_gain 35
clock_source internal

I believe the host is powerful enough (Thinkpad x240AL), and there are no O/L/U complaints from UHD. The "Asking for clock rate" and "Actually got clock rate" outputs match up exactly.

Using a bladeRF to test also produces the same result (generally, I can't see anything on UE).

The only thing I can think of is clock issues. Do I need a dedicated clock source / GPSDO to run OpenLTE? I've done a bit of background reading, and it appears that OpenLTE works without a GPSDO or any external clock source (e.g. https://lh3.googleusercontent.com/proxy/0k0C25c8OzATMW8sPURKLM3EjsxjKCAuliI9f7xrN4z4axGCHTWyeRdPA63Rt-0dA1O0PJ574iaXC9BAtD7BjAZAj3sYmTUv1w0ye3bxJexIaGngo2CSnRW7UNtW6GgmKWlQ-pR3shQ-oDVLp9c=w530-h347-p - nothing connected externally, nothing visible on-board as far as I can tell).

Am I missing something? Is there some other way to debug (and correct?) clock inaccuracies, if that's the case?

Thanks!

Dylanger Daly - 2017-08-13

Hello Friend!

I'm having the exact same issue. I'm using a Lenovo X250 @ 2.6 GHz with 2 cores, and I think the issues we're encountering are due to our laptops' lack of processing power.

I think the X230 @ 2.9 GHz may be better?

Did you get this resolved?

Dylanger Daly - 2017-08-13

The only other thing I can think of is that the B200 actually doesn't work and its listing as supported is a troll?

Dylanger Daly - 2017-08-13

The only other thing I can think of is the USB Bus? Being terrible? Maybe? This issue drives me crazy!

Dylanger Daly - 2017-08-14

If I set the bandwidth to 1.4 I can actually see the network; however, I'm unable to attach to it (correct Ki, IMSI and IMEI added).

I'm almost certain at this point the issue is related to clock and the need for a GPSDO.

Jeremy Quirke - 2017-08-14

The BladeRF comes very well calibrated and has no need for an external clock. I find, however, that the TCXO needs a minute or so after a cold power-up before the clock is stable.

You should confirm your device's calibration against a known GSM network using the kal tool.

Dylanger Daly - 2017-08-14

Cheers for the reply Jeremy, I've confirmed my clock is off by -2 kHz on my B200 and -67 kHz on my BladeRF. I'm currently running kal on the BladeRF to get a better DAC trim.

Jeremy Quirke - 2017-08-14

That 67 kHz is a suspicious number.
It's 1/4 of the GSM symbol rate, which means the kal tool may not be working properly. Assuming a GSM1800 network, 67 kHz is over 35 ppm, which does not sound right at all.
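
The arithmetic behind those two observations can be checked quickly. This is only a sketch: the 1.8 GHz carrier is an assumed round figure for a GSM1800 downlink, and `offset_ppm` is an illustrative helper name, not part of kal.

```python
# GSM symbol rate is 13 MHz / 48, roughly 270.833 kHz.
GSM_SYMBOL_RATE_HZ = 13e6 / 48


def offset_ppm(offset_hz, carrier_hz):
    """Express a frequency offset as parts per million of the carrier."""
    return offset_hz / carrier_hz * 1e6


# A quarter of the symbol rate lands almost exactly on the reported offset.
quarter_symbol_rate_hz = GSM_SYMBOL_RATE_HZ / 4

# Assumed carrier: 1.8 GHz (the exact channel used is not given in the thread).
bladerf_ppm = offset_ppm(67e3, 1.8e9)
b200_ppm = offset_ppm(2e3, 1.8e9)

print(f"quarter symbol rate: {quarter_symbol_rate_hz / 1e3:.1f} kHz")
print(f"67 kHz at 1.8 GHz:   {bladerf_ppm:.1f} ppm")
print(f"2 kHz at 1.8 GHz:    {b200_ppm:.2f} ppm")
```

Under that assumption the 67 kHz reading works out to about 37 ppm, while the 2 kHz reading is roughly 1 ppm.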

Jeremy Quirke - 2017-08-14

I would step back a moment.
There's no convincing evidence that this is a clock problem.
These devices are very well calibrated out of the box, so if you are going to play around with the trim DAC, at least save your old value.

Now, given what you have told me, that you do see the network at 1.4MHz, and at no other bandwidths, this suggests it is more likely to be the system struggling to keep up. What does the debug console say? Do you get lots of messages about skipping subframes?

One recommendation I can make is to disable hyperthreading. I have turned on the PMC counters in Windows and can confirm that if the O/S schedules code on the other hyperthread of the core the radio thread is running on, the FFT processing can double or even triple in length, causing stuttering.

Last edit: Jeremy Quirke 2017-08-14

Dylanger Daly - 2017-08-14

Indeed, they're very well calibrated out of the box; however, I purchased this BladeRF over a year ago and it's been sitting in a box, so I believe it may be a CLK issue.

I've followed OpenAirInterface's guide on setting up the Ubuntu machine (disabling P-states, disabling Intel power management, etc.); however, there seem to be no options to disable Hyperthreading?

I'm also using the lowlatency kernel. 2 cores @ 2.6 GHz seems to be more than enough? No?

Thanks for your response!

Last edit: Dylanger Daly 2017-08-14

Jeremy Quirke - 2017-08-14

One core runs the dedicated radio thread so everything else will go on the first core.

Should be enough.

Again, are you receiving warnings when you connect to the debug port about skipping frames?

Be cautious on your platform that the code below isn't affinitizing the other message processing threads to run on the hyperthreaded sibling of the radio core (I'm not familiar with how Linux reports _SC_NPROCESSORS_ONLN)

pthread_getaffinity_np(msgq->rx_thread, sizeof(af_mask), &af_mask);
CPU_CLR(sysconf(_SC_NPROCESSORS_ONLN)-1, &af_mask);
pthread_setaffinity_np(msgq->rx_thread, sizeof(af_mask), &af_mask);
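
One way to check which logical CPUs share a physical core on Linux is to read the sysfs topology files. The sketch below assumes the radio thread sits on the highest-numbered CPU (which is what the `CPU_CLR` call above seems to imply) and that `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list` is available; the function names are illustrative and not part of OpenLTE.

```python
import os
from pathlib import Path


def parse_cpu_list(text):
    """Parse a Linux CPU list such as "0-1,4" into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def siblings_of(cpu):
    """Logical CPUs sharing a physical core with `cpu` (includes `cpu` itself)."""
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list")
    try:
        return parse_cpu_list(path.read_text())
    except OSError:
        return {cpu}  # topology not exposed; assume no siblings


def non_radio_cpus(online, radio_cpu, siblings):
    """CPUs left for other threads: everything except the radio core's hyperthreads."""
    return set(online) - set(siblings) - {radio_cpu}


if hasattr(os, "sched_getaffinity"):  # Linux only
    online = set(os.sched_getaffinity(0))
    radio_cpu = max(online)  # assumption: radio thread on the last CPU
    others = non_radio_cpus(online, radio_cpu, siblings_of(radio_cpu))
    print(f"radio CPU: {radio_cpu}, CPUs left for other threads: {sorted(others)}")
```

If the printed list is empty, or clearing only the last CPU would still leave the radio core's sibling in the mask, the message threads can end up competing with the radio thread for the same physical core.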

Last edit: Jeremy Quirke 2017-08-14

Dylanger Daly - 2017-08-14

Sorry yes, literally crazy spammed with them, like 100s every second.

Jeremy Quirke - 2017-08-14

OK so you have a performance problem, not a clock problem.

Again, double check that the radio thread is running real time and affinitized to one hyperthreaded core.

Make sure the other threads are affinitized to every other core and not the hyperthreaded sibling of the radio core.

Also check of course that other processes aren't chewing CPU etc (the obvious).

There's one more obvious question: is the USB port operating at SuperSpeed?

(bladeRF-cli -i, info command)

Last edit: Jeremy Quirke 2017-08-14

Dylanger Daly - 2017-08-14

Hmm, these are the errors I get:

08/14/2017 17:19:37.112977 warning msgq LTE_fdd_enb_msgq.cc 234 mac_to_phy circular buffer empty on receive
08/14/2017 17:19:37.113366 warning msgq LTE_fdd_enb_msgq.cc 234 mac_to_phy circular buffer empty on receive
08/14/2017 17:19:37.113507 warning msgq LTE_fdd_enb_msgq.cc 234 mac_to_phy circular buffer empty on receive
08/14/2017 17:19:37.113639 warning msgq LTE_fdd_enb_msgq.cc 234 mac_to_phy circular buffer empty on receive
08/14/2017 17:19:37.113769 warning msgq LTE_fdd_enb_msgq.cc 234 mac_to_phy circular buffer empty on receive
08/14/2017 17:19:37.113944 warning msgq LTE_fdd_enb_msgq.cc 234 mac_to_phy circular buffer empty on receive
08/14/2017 17:19:37.114062 warning msgq LTE_fdd_enb_msgq.cc 234 mac_to_phy circular buffer empty on receive
08/14/2017 17:19:37.114094 warning msgq LTE_fdd_enb_msgq.cc 234 mac_to_phy circular buffer empty on receive
08/14/2017 17:19:37.114122 warning msgq LTE_fdd_enb_msgq.cc 234 mac_to_phy circular buffer empty on receive
08/14/2017 17:19:37.114458 warning msgq LTE_fdd_enb_msgq.cc 234 mac_to_phy circular buffer empty on receive
08/14/2017 17:19:37.114492 warning msgq LTE_fdd_enb_msgq.cc 234 mac_to_phy circular buffer empty on receive
08/14/2017 17:19:37.114520 warning msgq LTE_fdd_enb_msgq.cc 234 mac_to_phy circular buffer empty on receive
08/14/2017 17:19:37.114696 warning msgq LTE_fdd_enb_msgq.cc 234 mac_to_phy circular buffer empty on receive
08/14/2017 17:19:37.115120 info mac LTE_fdd_enb_mac.cc 407 MAC_dl_tti - PHY_dl_tti != 2 (1), skipping 0 subframes
08/14/2017 17:19:37.115289 info mac LTE_fdd_enb_mac.cc 407 MAC_dl_tti - PHY_dl_tti != 2 (1), skipping 0 subframes
08/14/2017 17:19:37.115328 warning msgq LTE_fdd_enb_msgq.cc 234 phy_to_mac circular buffer empty on receive
08/14/2017 17:19:37.115367 warning msgq LTE_fdd_enb_msgq.cc 234 phy_to_mac circular buffer empty on receive
08/14/2017 17:19:37.116244 info mac LTE_fdd_enb_mac.cc 407 MAC_dl_tti - PHY_dl_tti != 2 (0), skipping 0 subframes
08/14/2017 17:19:37.116784 error phy LTE_fdd_enb_phy.cc 709 PDSCH current_tti from MAC (7981) does not match PHY (8001)
08/14/2017 17:19:37.117119 info mac LTE_fdd_enb_mac.cc 407 MAC_dl_tti - PHY_dl_tti != 2 (-1), skipping 4 subframes
08/14/2017 17:19:37.117827 error phy LTE_fdd_enb_phy.cc 709 PDSCH current_tti from MAC (7982) does not match PHY (8002)
08/14/2017 17:19:37.118742 error phy LTE_fdd_enb_phy.cc 709 PDSCH current_tti from MAC (7983) does not match PHY (8003)
08/14/2017 17:19:37.120008 error phy LTE_fdd_enb_phy.cc 709 PDSCH current_tti from MAC (7984) does not match PHY (8004)
08/14/2017 17:19:37.123144 warning msgq LTE_fdd_enb_msgq.cc 234 mac_to_timer circular buffer empty on receive
08/14/2017 17:19:37.123303 error phy LTE_fdd_enb_phy.cc 709 PDSCH current_tti from MAC (7997) does not match PHY (8007)
08/14/2017 17:19:37.123683 error phy LTE_fdd_enb_phy.cc 468 Late DL subframe from MAC:8007, PHY is currently on 8008
08/14/2017 17:19:37.123977 error phy LTE_fdd_enb_phy.cc 503 Late UL subframe from MAC:8004, PHY is currently on 8005
08/14/2017 17:19:37.124128 warning msgq LTE_fdd_enb_msgq.cc 234 phy_to_mac circular buffer empty on receive
08/14/2017 17:19:37.124186 error phy LTE_fdd_enb_phy.cc 709 PDSCH current_tti from MAC (7998) does not match PHY (8008)
08/14/2017 17:19:37.124422 error phy LTE_fdd_enb_phy.cc 468 Late DL subframe from MAC:8008, PHY is currently on 8009
08/14/2017 17:19:37.124645 error phy LTE_fdd_enb_phy.cc 503 Late UL subframe from MAC:8005, PHY is currently on 8006
08/14/2017 17:19:37.124832 error phy LTE_fdd_enb_phy.cc 709 PDSCH current_tti from MAC (7999) does not match PHY (8009)
08/14/2017 17:19:37.125117 error phy LTE_fdd_enb_phy.cc 503 Late UL subframe from MAC:8006, PHY is currently on 8007

{Loops Here}
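
A log like this is easier to reason about once it is tallied. The following is a rough sketch that counts messages per level and layer and extracts how far the PHY is ahead of the MAC; the regular expressions are guesses based only on the lines shown above, and TTI wrap-around is not handled.

```python
import re
from collections import Counter

# Three lines copied from the log above.
SAMPLE_LOG = """\
08/14/2017 17:19:37.113639 warning msgq LTE_fdd_enb_msgq.cc 234 mac_to_phy circular buffer empty on receive
08/14/2017 17:19:37.116784 error phy LTE_fdd_enb_phy.cc 709 PDSCH current_tti from MAC (7981) does not match PHY (8001)
08/14/2017 17:19:37.123683 error phy LTE_fdd_enb_phy.cc 468 Late DL subframe from MAC:8007, PHY is currently on 8008
"""

# date, time, level, layer, source file, source line, message
LINE_RE = re.compile(
    r"^\S+ \S+ (?P<level>\w+) (?P<layer>\w+) (?P<file>\S+) (?P<line>\d+) (?P<msg>.*)$"
)
TTI_RE = re.compile(r"from MAC \((\d+)\) does not match PHY \((\d+)\)")


def summarise(log_text):
    """Count (level, layer) pairs and collect PHY-minus-MAC TTI lags in subframes."""
    counts = Counter()
    lags = []
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw)
        if not m:
            continue
        counts[(m["level"], m["layer"])] += 1
        tti = TTI_RE.search(m["msg"])
        if tti:
            mac, phy = map(int, tti.groups())
            lags.append(phy - mac)
    return counts, lags


counts, lags = summarise(SAMPLE_LOG)
print(counts)
print("PHY ahead of MAC by (subframes):", lags)
```

Since one LTE subframe lasts 1 ms, a lag of 20 subframes means the MAC is handing over data about 20 ms behind the PHY's current position.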

Yeah she's on SuperSpeed

Here's OpenLTE Config:
System Configuration Parameters
Read parameters using read <param> format
Set parameters using write <param> <value> format
Commands:
start - Constructs the system information and starts the eNB
stop - Stops the eNB
shutdown - Stops the eNB and exits
construct_si - Constructs the new system information
help - Prints this screen
add_user imsi=<imsi> imei=<imei> k=<k> - Adds a user to the HSS (<imsi> and <imei> are 15 decimal digits, and <k> is 32 hex digits)
del_user imsi=<imsi> - Deletes a user from the HSS
print_users - Prints all the users in the HSS
print_registered_users - Prints all the users currently registered
Radio Parameters:
available_radios: (read-only)
0: no_rf
1: bladerf-ID
selected_radio_name (read-only) = bladerf-ID
selected_radio_idx = 1
clock_source = internal
System Parameters:
band = 5
bandwidth = 5
cell_id = 1
debug_level = radio phy mac rlc pdcp rrc mme gw user rb timer iface msgq
debug_type = error warning info debug
dl_center_freq = 869700000
dl_earfcn = 2407
dns_addr = C0A80101
enable_pcap = 0
ip_addr_start = C0A80102
mac_direct_to_ue = 0
mcc = 001
mnc = 01
n_ant = 1
n_id_cell = 0
p0_nominal_pucch = -96
p0_nominal_pusch = -70
phy_direct_to_ue = 0
q_hyst = 0
q_rx_lev_min = -140
rx_gain = 30
search_win_size = 0
sib3_present = 0
sib4_present = 0
sib5_present = 0
sib6_present = 0
sib7_present = 0
sib8_present = 0
tracking_area_code = 1
tx_gain = 30
ul_center_freq = 824700000
ul_earfcn = 20407
use_cnfg_file = 0
use_user_file = 0
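
As a sanity check on the configuration above, the EARFCN values can be converted to centre frequencies with the formula from 3GPP TS 36.101 (F = F_low + 0.1 MHz x (N - N_offs)). This sketch hard-codes the band 5 constants only; the function names are illustrative.

```python
# Band 5 constants from 3GPP TS 36.101, in kHz to keep the arithmetic exact.
BAND5 = {
    "f_dl_low_khz": 869_000, "n_offs_dl": 2400,
    "f_ul_low_khz": 824_000, "n_offs_ul": 20400,
}


def dl_freq_hz(earfcn, band=BAND5):
    """Downlink centre frequency: F_DL_low + 100 kHz per EARFCN step above the offset."""
    return (band["f_dl_low_khz"] + 100 * (earfcn - band["n_offs_dl"])) * 1000


def ul_freq_hz(earfcn, band=BAND5):
    """Uplink centre frequency: F_UL_low + 100 kHz per EARFCN step above the offset."""
    return (band["f_ul_low_khz"] + 100 * (earfcn - band["n_offs_ul"])) * 1000


print(dl_freq_hz(2407))   # compare with dl_center_freq
print(ul_freq_hz(20407))  # compare with ul_center_freq
```

Both results agree with the dl_center_freq and ul_center_freq values printed by the eNB, so the frequency plan itself looks consistent.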

Dylanger Daly - 2017-08-14

Top Output:
Tasks: 222 total, 2 running, 220 sleeping, 0 stopped, 0 zombie
%Cpu(s): 21.3 us, 7.1 sy, 0.0 ni, 71.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 7888280 total, 6223636 free, 901288 used, 763356 buff/cache
KiB Swap: 8073212 total, 8073212 free, 0 used. 6555636 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2384 root 20 0 1909604 138340 15404 S 57.8 1.8 5:56.41 LTE_fdd_enodeb

Jeremy Quirke - 2017-08-14

OK,

We need to narrow down your problem. Here are several possibilities:

1. CPU. The processor itself is fast enough to run a 5 MHz carrier, but either
1.1 there is lots of interrupt/bottom-half processing going on, or
1.2 you are suffering lots of L2/L3 cache thrashing due to stuff running on the hyperthread/interrupts/bottom halves; this can hurt the downlink IFFT time bigly :P

2. The USB controller can't keep up. I would look here first. Follow this process to identify whether the controller can keep up:

https://github.com/Nuand/bladeRF/wiki/Debugging-dropped-samples-and-identifying-achievable-sample-rates

Edit: the above process tests the RX path only. If this successfully passes @ 7.68MHz, then set the GPIO to digital loopback, and then use the tx/rx commands in bladeRF-cli to verify the transmitted file comes back exactly the same (md5sum them).

Good luck!

Last edit: Jeremy Quirke 2017-08-14

Dylanger Daly - 2017-08-14

Thanks heaps for this info!

So I successfully disabled hyperthreading; the OS now only sees one core, running at its max freq.

I've also run the 'gap' test on the USB Bus as per the link above and it came back 'No Gaps', was there more I had to test for the USB Bus?

00:1d.0 USB controller: Intel Corporation Wildcat Point-LP USB EHCI Controller (rev 03) (prog-if 20 [EHCI])
Subsystem: Lenovo Wildcat Point-LP USB EHCI Controller
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 23
Region 0: Memory at f123d000 (32-bit, non-prefetchable) [size=1K]
Capabilities: [50] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [58] Debug port: BAR=1 offset=00a0
Capabilities: [98] PCI Advanced Features
AFCap: TP+ FLR+
AFCtrl: FLR-
AFStatus: TP-
Kernel driver in use: ehci-pci

This is the controller I'm using. It's really weird: I still get these errors quite hard, and they spam super fast.

I think it could potentially be the USB Bus with this laptop (X250) because I get the exact same issues on my USRP B200, same errors being spammed on debug.

Would I be able to grab a recommendation for a laptop? I was thinking the Lenovo X230 i7?

Thank you so much for helping me with these issues, I really appreciate it!

Jeremy Quirke - 2017-08-14

no problem.

OK yes, we need to clarify what you tested.

Test A

The instructions in the link I provided are testing only the RX path without any TX.
This is the first step to pass.

I would run the test with n=250M or similar at sample rates of 1.92M, 3.84M, 7.68M, 15.36M and even 30.72M. 15.36M will allow for a 5MHz carrier (2 directions).

You should be using an SSD for this test, as a magnetic drive may be the limiting factor. If you don't have an SSD, create a RAM disk of 1-2GB.
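
To put rough numbers on what the USB link and the disk have to sustain, here is a small sketch. It assumes 4 bytes per complex sample (the SC16 Q11 format that bladeRF-cli reports), decimal megabytes, and the usual LTE bandwidth-to-sample-rate mapping; none of these helpers come from bladeRF or OpenLTE.

```python
BYTES_PER_SAMPLE = 4  # SC16 Q11: 16-bit I plus 16-bit Q (assumed for every capture)

# Commonly used LTE sample rates per channel bandwidth (MHz -> samples/s).
LTE_SAMPLE_RATE = {1.4: 1.92e6, 3: 3.84e6, 5: 7.68e6, 10: 15.36e6, 20: 30.72e6}


def stream_mb_per_s(sample_rate_sps):
    """Sustained one-direction data rate in MB/s (1 MB = 10**6 bytes)."""
    return sample_rate_sps * BYTES_PER_SAMPLE / 1e6


def capture_size_gb(n_samples):
    """Size of a capture file in GB (1 GB = 10**9 bytes)."""
    return n_samples * BYTES_PER_SAMPLE / 1e9


for rate in (1.92e6, 3.84e6, 7.68e6, 15.36e6, 30.72e6):
    print(f"{rate / 1e6:5.2f} Msps -> {stream_mb_per_s(rate):6.2f} MB/s")

print(f"250e6 samples -> {capture_size_gb(250e6):.2f} GB")
print(f"5 MHz carrier, TX + RX -> {2 * LTE_SAMPLE_RATE[5] / 1e6:.2f} Msps aggregate")
```

A 5 MHz carrier at 7.68 Msps in each direction gives the 15.36 Msps aggregate mentioned above, and a 250M-sample capture is about 1 GB, which is why a 1-2GB RAM disk is enough.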

Once you confirm no gaps @ 15.36M proceed to the next test.

Test B

The next test is to configure the bladeRF @ 3.84M and 7.68M and use digital loopback.
You will then transmit the gapless file you received in Test A and receive it back using bladeRF-cli.
Use the tx config, rx config, tx start and rx start commands to do this.

Then gap-check the received file and make sure it is OK.

If there are no differences, the USB bus is OK.

Edit: Wait, do you mean you only have a single processor showing?
please show
cat /proc/cpuinfo

Last edit: Jeremy Quirke 2017-08-14

Dylanger Daly - 2017-08-14

Awesome so Test A was successful (SSD HDD):
bladeRF> set samplerate rx 15.36M

Setting RX sample rate - req: 15360000 0/1Hz, actual: 15360000 0/1Hz

bladeRF> set gpio 0x257

GPIO: 0x00000257
LMS Enable: Enabled
LMS RX Enable: Enabled
LMS TX Enable: Enabled
TX Band: Low Band (300M - 1.5GHz)
RX Band: Low Band (300M - 1.5GHz)
RX Source: Internal 32-bit counter

bladeRF> rx config file=/dev/shm/samples_15.36msps.bin n=15.36M
bladeRF> rx start
bladeRF> rx

State: Idle
Last error: None
File: /dev/shm/samples_15.36msps.bin
File format: SC16 Q11, Binary
Samples: 16106127
Buffers: 32
Samples per buffer: 32768
Transfers: 16
Timeout (ms): 1000

bladeRF> x
➜ ~ ./blade_gaps.py /dev/shm/samples_15.36msps.bin
Number of gaps:0
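
For readers without the wiki script to hand, the idea behind a gap check can be sketched as follows. This is not the actual blade_gaps.py; it assumes each 4-byte sample carries a little-endian unsigned 32-bit counter that increments by one per sample, which is what the "Internal 32-bit counter" RX source suggests.

```python
import struct


def find_gaps(data):
    """Return (index, got, expected) for every sample that breaks the +1 sequence."""
    n = len(data) // 4
    values = struct.unpack(f"<{n}I", data[: n * 4])
    gaps = []
    for i in range(1, n):
        expected = (values[i - 1] + 1) & 0xFFFFFFFF  # counter wraps at 2**32
        if values[i] != expected:
            gaps.append((i, values[i], expected))
    return gaps


# Synthetic demonstration: a clean counter, then one with samples 5 and 6 dropped.
clean = struct.pack("<10I", *range(10))
dropped = struct.pack("<8I", 0, 1, 2, 3, 4, 7, 8, 9)

print(find_gaps(clean))
print(find_gaps(dropped))
```

On a real capture the file would be read in chunks rather than unpacked in one go, but the comparison is the same: any sample that is not its predecessor plus one marks a point where data was lost.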

Would you be able to elaborate a little more on how to tx and rx the samples_15.36msps.bin file?

Also when you say digital loopback do you mean literally use a pigtail and hook RX to TX? Or is this a GPIO Config thing?

CPU Info:
➜ ~ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 61
model name : Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz
stepping : 4
microcode : 0x25
cpu MHz : 2593.773
cache size : 4096 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 20
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt dtherm arat pln pts
bugs :
bogomips : 5187.54
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:

Cheers!

Last edit: Dylanger Daly 2017-08-14

Jeremy Quirke - 2017-08-14

tx config file=samples_15.36msps repeat=0
rx config file=samples_15.36msps.check n=76.8M
set loopback firmware
tx start
rx start

now run the python gaps on samples_15.36msps.check

For some reason I can't run both TX and RX on Windows.
Maybe you can.

If that doesn't work, no matter.
I think our issue is CPU after all.

1 core may not be enough.

I have a Core i7-4710 with 4 cores / 8 hyperthreads, and I can run fine with no skipping, even with web browsing etc., @ 5MHz LTE.

Normally I run the radio thread on logical processor 7, and everything else affinitized to logical processors 0-5 inclusive (avoid core 6 - I can't disable HT on my laptop).

But if I affinitize the whole process to one core, to simulate your scenario, it falls apart.

So I guess there's your answer.

In your case turning HT back on is no good either, as that is not going to help.

Last edit: Jeremy Quirke 2017-08-14

Jeremy Quirke - 2017-08-14

On the upside, this experiment helped me reproduce a bug in the bladerf interface. I have started a new thread with the diff and bugfix.

Dylanger Daly - 2017-08-14

Cheers Jeremy!

I actually got hit with 5 gaps during this test:
➜ ~ ./blade_gaps.py /dev/shm/samples_15.36msps.check
[16106127] = 0, Expected 16106127, Gap = 16106127
[32212254] = 0, Expected 16106127, Gap = 16106127
[48318381] = 0, Expected 16106127, Gap = 16106127
[64424508] = 0, Expected 16106127, Gap = 16106127
[80530635] = 0, Expected 16106127, Gap = 16106127
Number of gaps:5

Does this mean CPU or USB Bus?

Awesome! Squashing Bugs!

Jeremy Quirke - 2017-08-14

The pattern is too regular for it to be real gaps, so, no, I would say no issue with USB for you.

Your issue is CPU, single core will not cut the mustard unfortunately for you.
You might have some luck @ 1.4MHz bandwidth instead.
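
The regularity can be made explicit. In the sketch below the gap positions are taken from the output above and the file length from the "Samples:" line of the Test A capture; the interpretation (the counter restarting each time the transmitted file repeats) is an assumption, and `looks_like_file_wrap` is just an illustrative helper.

```python
FILE_SAMPLES = 16106127  # "Samples:" reported for the Test A capture

# Sample indices at which blade_gaps.py reported a gap in the loopback capture.
gap_indices = [16106127, 32212254, 48318381, 64424508, 80530635]


def looks_like_file_wrap(indices, file_len):
    """True if the k-th gap sits exactly at (k + 1) * file_len samples."""
    return all(idx == (k + 1) * file_len for k, idx in enumerate(indices))


print(looks_like_file_wrap(gap_indices, FILE_SAMPLES))
```

Every reported gap falls on an exact multiple of the file length and the value read there is 0, which fits the counter simply starting over at each repeat of the transmitted file rather than samples going missing on the bus.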

Last edit: Jeremy Quirke 2017-08-14

Dylanger Daly - 2017-08-14

Ah awesome, I've actually tried 1.4 MHz. It works and it's discoverable on UEs; however, I can't attach, and on top of that I still get those errors above spammed?

Jeremy Quirke - 2017-08-14

Yeah, still almost certainly CPU limited, mate. My guess is that the single biggest issue is that one CPU is running everything, so the L1/L2 caches are getting cleared out between runs of the math-heavy stuff (process_dl), which is hurting performance.

Not to mention hardware interrupts and bottom halves (especially USB controller ones that copy buffers around) will have no choice but to be assigned to that single CPU which will also likely cause cache evictions etc.

The fact that 1.4MHz is getting somewhere points to this.
Again, 2 cores should be enough. I did the experiment on my system locking the affinity to 2 cores only and it ran at 5MHz without a glitch.

Last edit: Jeremy Quirke 2017-08-14
