Why decentralised tracing is wrong.

14 05 2020

Let me be clear, I am not associated with, or working on, any tracing app. I am a user. Let’s also be clear. Covid-19 is a frightening virus which has a horrible impact on some. 30% of patients don’t make it out of ICU. The impact on some ethnic minorities is disproportionate. If you have a BMI over 30 you’re 37% more likely not to survive.

Our aim as a society is to eliminate it. Every tool available is needed in this fight, and technology is one of those tools. In an ideal world, every person would have the latest mobile device, with the latest OS. They would always have the device turned on and operating. When they became ill they would self diagnose with 100% accuracy and honestly report to the app, which would dutifully report to all contacts. The world is not ideal, it’s messy and imperfect. Covid-19 jumped into humans. Most people, even in 1st world countries, don’t have a mobile device that will run the app. (Evidence: on the Isle of Wight, 80% of vulnerable people (elderly) and poorer families have old iPhones pre iOS 10. A much smaller number have Android pre 6.0.) The population are not clinicians and they are scared. They will over report, misdiagnose and even, in rare cases, maliciously report. In Germany, the decentralised app requires evidence of a positive Covid-19 test to combat this. That works in Germany, as Germany has a surplus of testing capacity. In other countries (eg UK, probably US) health care and social care workers struggle to get tested.

Technology for tracing is an assistive tool, but must be augmented by other techniques. Without centralised anonymised data, it is impossible to perform any of the other standard epidemiological processes that eliminate the spread of infections. Most countries have done this for years on a small scale, ensuring that sexually transmitted diseases have not become pandemic.

When a parent who doesn’t drop their primary school child at school becomes ill and notifies the decentralised tracing app, who will tell the school? Who will tell the parents of the other children in the same class? Who will tell the care homes where the parents of those other children work, looking after the 85 year old loved grandparents? No one anywhere in that chain had a mobile device capable of running the app, so none of them is notified by a decentralised method that knows nothing about humans and the messy world we live in.

In the centralised tracing model, the team of human tracers have access to that information and will phone the school, get the list of parents and warn them to self isolate. The few Covid-19 tests that are available will be used by the social care workers and hopefully they won’t infect the 85 year old loved grandparents.

In addition, because we said this is different, this isn’t about a minuscule and irrelevant risk to privacy but about the lives of my loved ones, the health organisations we believe in and trust have the data they need to protect us all. We unknowingly give far more personal tracking data to internet giants in return for being recommended 2m videos of almost no relevance. Of course Apple and Google want to force us to use the decentralised model. They don’t want their credentials for protecting privacy to be questioned. The mechanisms in use drive their business models. And they don’t care if you don’t have a phone you can run it on, you’re not a customer. Decentralised tracing only protects wealthy Apple customers. Google is better, a 10 year old Google Nexus will run the UK NHS App.

It reminds me of when Apple  or Google (can’t remember which) developed some AI for a health feature but couldn’t understand why it was reporting health conditions monthly for just over half the population between the ages of 14 and 50, until the one woman in the meeting asked if there were any females in the set of training data. Dumb mistake. Period.  Not realising the world is real is a big mistake to be making again.





Contact Tracing with Privacy Built in.

19 04 2020

Contact tracing is an n squared or more computationally intensive operation when done centrally. It is impossible to do at scale manually, with each operation taking days. For governments to perform it centrally requires citizens to give up their rights to privacy, especially when a contact tracing app is being used. And yet, it seems, most contact tracing apps appear to want to gather all the data centrally. Given the current urgency this will not work and will require months of negotiation. I am not a Politician or a Lawyer, I am an Engineer. As an Engineer this is how I would perform contact tracing.

A Bluetooth LE (BLE) receiver measures the signal strength of BLE transmitters and estimates the distance. Each transmitter has a unique ID. Some information, such as make of device, can be derived from the unique ID, but the ID is anonymous to the receiver unless it tries to de-anonymise it with a centralised service.

On 20 Feb 2020 I was flying back from Basel to Luton on an EasyJet flight. I was curious about privacy on planes and BLE, so I ran a BLE scanner to see who was around me. 100s of BLE transmitters were present. Phones, lots of Apple earpods, some VR headsets but I could see no-one wearing them. The scanner reported distance from me in real time, including when the drinks trolley with its credit card device went down the aisle. Just before the flight took off, 2 Chinese males came on board (the writing on the laptop was Chinese). They both wore face-masks. The younger looked exhausted and slept all the way to London. They were clearly alert to coming into contact with anyone. They sat in their own row and asked someone to move a coat without touching it.

Provided the BLE UIDs are not stored centrally or parsed, they do not create privacy issues. An app for contact tracing would collect all UIDs it sees and record how long each one spent within a certain distance. It might also hash the UID internally before storing it. It would store this data on the phone and never send any of it off the phone.

When someone becomes ill, they open the app and say they are ill. The app then asks for permission to send only their ID to a central server. Provided they agree, their BLE UID is sent to the server. The server stores that UID in a list and other apps on other phones download the list.

Each phone then does the contact tracing. It hashes every UID in the downloaded list with its own key and compares the hashed value against the list of hashed values it has stored. If it finds a match it can inform the owner of the phone.

The central server has a list of UIDs identifying those claiming to be COVID-19 positive (globally 75K/day limited by testing, probably more like 300K/day, see CSSE for data). It doesn’t have any information on who that person interacted with. It doesn’t have the number, distance, time or anything. The database on the phone doesn’t have UIDs, only hashed versions of the UIDs. It can only check if a UID that it has been given is in the database. It can’t tell anyone which UIDs are in the database.
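The on-phone check described above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the design above: the `ContactStore` class, the choice of HMAC-SHA256 as the keyed hash, and the UID strings are all made up for the example.

```python
import hashlib
import hmac
import os


class ContactStore:
    """Hypothetical on-phone store that keeps only keyed hashes of seen UIDs."""

    def __init__(self, key=None):
        # Assumed per-phone secret key; it is generated on the phone and stays there.
        self._key = key if key is not None else os.urandom(32)
        self._seen = set()

    def _digest(self, uid):
        # Keyed hash of the UID (HMAC-SHA256 is an assumption, any keyed hash would do).
        return hmac.new(self._key, uid.encode("utf-8"), hashlib.sha256).digest()

    def record(self, uid):
        """Remember that a UID was seen nearby, storing only its keyed hash."""
        self._seen.add(self._digest(uid))

    def matches(self, reported_uids):
        """Return the reported UIDs whose keyed hash is in the local store."""
        return [uid for uid in reported_uids if self._digest(uid) in self._seen]


store = ContactStore()
for uid in ("uid-alice", "uid-bob"):  # made-up UIDs seen over BLE
    store.record(uid)

# List downloaded from the central server: UIDs of people who reported being ill.
reported = ["uid-carol", "uid-bob"]
print(store.matches(reported))
```

The store can answer "have I seen this UID?" for a UID it is handed, but because only keyed hashes are kept it cannot list the UIDs it has seen.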

The app does require that the person who claims to be COVID-19 positive is willing to let the central server know. And this does require that the contact tracing app is allowed to capture BLE UIDs in the background, which is currently not allowed by Apple or Google.

The app relies a lot on trust. Trust that it won’t send the UIDs or database to a central server, or allow that UID to be broadcast to the world, without the explicit permission of the owner of the phone. It does therefore have privacy concerns, but only at the point of being confirmed as a carrier. Every phone must perform the checks against its database.

 

How would it be done centrally?

Centrally, the UID database on each phone would have to be stored centrally. When a confirmed case came in, every contact in that centrally stored database would be notified. This is less computationally intensive. Privacy could be maintained by encrypting the database with a key known only to the phone and released only with the consent of the user. Less broadcasting, but the same requirement to build trust in the app.

 

BTW, there are BLE Beacons in many public spaces (shops), so an outbreak location can also be tracked with relative ease, but unless the beacon recorded all visitors, it would not be able to notify them (at least not in the centralised model).

 

However, technology is not everything. Many infections may happen with no person present, as the virus lives on hard surfaces for up to 72h. Something will be better than nothing, but won’t be perfect.





Fouling

3 05 2017

[Image: the new ultrasonic antifouling controller board]

Fouling on any boat, large or small, eats boatspeed, fuel and satisfaction. Most boats haul out every year and pressure wash or scrape off the marine flora and fauna that have taken up residence. Owners then repaint the hull with a toxic antifouling paint. Toxic to marine life, and unpleasant to humans. For small boats the toxicity has been greatly reduced over the years; some would argue so has the effectiveness. The problem is the paint erodes and the toxins leak. High concentrations of boats lead to high concentrations of toxins.

For larger ships the cost of antifouling is significant: interruption to service and the cost of a period in dry dock. Antifouling on large ships is considerably more toxic than that available to the pleasure boat industry, in order to extend the periods between maintenance to several years.

About 10 years ago I coated my boat with copper particles embedded in epoxy. The exposed copper surface of the particles reacts with sea water to create an insoluble copper compound. This doesn’t leach but, like the solid copper sheets that clipper ships were coated in, discourages marine fouling. I have not painted since. Until a few years ago I did need to scrub off. I would see a few barnacles and some marine growth, but no more than would be normal with toxic paint.

A few years ago I added 2 ultrasonic antifouling units. These send low power ultrasonic pulses into the hull and the water. According to research performed at Singapore University, barnacle larvae use antennae to feel the surface they are about to attach to. Once attached, the antennae stick to the surface and convert into the shell, and they never fall off. Ultrasound at various frequencies excites the antennae, which disrupts the sensing process. The larvae swim past to attach elsewhere. There is also some evidence that ultrasound reduces algae growth. The phenomenon was first discovered testing submarine sonar over 50 years ago. My uncle did submarine research in various Scottish lochs during the war.

The ultrasonic antifouling I have fitted currently was a punt. 2 low cost units from Jaycar, first published in an Australian electronics magazine, that you put together yourself. Those are driven by an 8 pin PIC driving 2 MOSFETs. I think it’s made a difference. After a year in the water I have no barnacles and a bit of soft slime. There are commercial units available at more cost, but the companies selling them seem to come and go. I am not convinced enough to spend that sort of money but I am curious and prepared to design and build a unit.

The new unit (board above), for the new boat, is a bit more sophisticated. It’s a 6 channel unit controlled by an Arduino Mega with custom code, controlling a MOSFET signal generator and 6 pairs of MOSFETs. It outputs 15-150kHz at up to 60W per channel. A prototype on the bench works and has my kids running out of the house (they can hear the harmonics). My ears are a little older so I can’t, but they still ring a bit. I won’t run it at those levels as that will likely cavitate the water and kill things as well as eat power. It remains to be seen if the production board works, I have just ordered a batch of 5 from an offshore fabrication shop.





Metrics off Grid

10 04 2017

I sail, as many of those who have met me know. I have had the same boat for the past 25 years and I am in the process of changing it. Seeing Isador eventually sold will be a sad day as she predates even my wife. The site of our first date. My kids have grown up on her, strapped into car seats when babies, laughing at their parents soaked and at times terrified (parents that is, not the kids). See https://hallbergrassy38forsale.wordpress.com/ for info.

The replacement will be a new boat, something completely different. A Class 40 hull provisioned for cruising. A Pogo 1250. To get the most out of the new boat and to keep within budget (or to spend more on sails) I am installing the bulk of the electronics. In the spare moments I get away from work I have been building a system over the past few months. I want to get the same sort of detailed evidence that I get at work. At work, I would expect to record 1000s of metrics in real time from 1000s of systems. A sailing boat is just as real time, except it’s off grid. No significant internet, no cloud, only a limited bank of 12V batteries for power, and a finite supply of diesel to generate electricity, in reality no 240V, but plenty of wind and solar at times. That focuses the mind and forces implementation to be efficient. The budget is measured in Amps or mA as it was for Apollo 13, but hopefully without the consequences.

Modern marine instruments communicate using a variant (NMEA2000) of the CAN Bus present in almost every car for the past 10 years, loved by car hackers. The marine variant adds some physical standards mostly aimed at easing amateur installation and waterproofing. The underlying protocol and electrical characteristics are the same as a standard CAN Bus. The NMEA2000 standard also adds message types or PGNs specific to the marine industry. The standard is private, available only to members, but OpenSource projects like CanBoat on GitHub have reverse engineered most of the messages.

Electrically the CAN Bus is a twisted pair with a 120 Ohm resistor at each end to make the pair appear like an infinite length transmission line (ie no reflections). The marine versions of the resistors or terminators come with a marine price tag, even though they often have poor tolerances. Precision 120 Ohm resistors are a few pence, and once soldered and encapsulated will exceed any IP rating that could be applied to the marine versions. The marine bus runs at 250kb/s, slower than many vehicle CAN bus implementations. Manufacturers add their own variants, for instance Raymarine SeatalkNG, which adds proprietary plugs, wires and connectors.

My new instruments are Raymarine, a few heads and sensors, a chart plotter and an Autopilot. They are basic with limited functionality. Had I added a few 0s to the instrument budget I would have gone for NKE or B&G, which have a “Race Processor” and sophisticated algorithms to improve the sensor quality, including wind corrections for heel and mast tip velocity. In the consumer versions, however, the corrections allowed are limited to a simple linear correction table. I would be really interested to see the code for those extra 0s, assuming it’s not mostly on the carbon fibre box. This is where it gets interesting for me. In addition to the CanBoat project on GitHub there is an excellent NMEA2000 project targeting Arduino style boards in C++, and SignalK, which runs on Linux. The NMEA2000 project running on an Arduino Due (ARM Cortex-M3 core processor) allows read/write interfacing to the CAN Bus, converting the CAN protocol into a serial protocol that SignalK running on a Pi Zero W can consume. SignalK acts as a conduit to an instance of InfluxDB, which captures boat metrics in a time series database. The metrics are then viewed in Grafana in a web browser on a tablet or mobile device.

I mentioned power earlier. The setup runs on the Pi Zero W with a load average below 1, the board consuming about 200mA (5V). The Arduino Due consumes around 80mA@5V most of the time. There are probably some optimisations to IO that can be performed in InfluxDB to minimize the power consumption further. Details of the setup are in https://github.com/ieb/nmea2000. Here is an example dashboard showing apparent wind and boat speed from a test dataset, taken from the Pi Zero W.

[Image: Grafana dashboard showing apparent wind and boat speed]
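To put the 200mA and 80mA figures into battery terms, here is a rough back-of-envelope calculation in Python. The 12V battery voltage comes from the post, but the 85% efficiency of the 12V to 5V converter is purely an assumed number, and the function names are made up for the example.

```python
def battery_amps(loads_ma_5v, battery_volts=12.0, converter_efficiency=0.85):
    """Current drawn from the battery to feed a set of 5V loads given in mA.

    converter_efficiency is an assumption about the 12V to 5V step-down converter.
    """
    watts = sum(loads_ma_5v) / 1000.0 * 5.0
    return watts / (battery_volts * converter_efficiency)


def amp_hours_per_day(amps):
    """Battery capacity consumed by a constant draw over 24 hours."""
    return amps * 24.0


loads = [200, 80]  # Pi Zero W and Arduino Due, mA at 5V, figures from the text
amps = battery_amps(loads)
print(f"{amps:.3f} A from the battery, {amp_hours_per_day(amps):.1f} Ah per day")
```

With a lossless converter the two boards together would be 1.4W, a little over 0.1A at 12V. Any real converter loss adds to that.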

Remember those expensive race processors. The marketing documentation talks of multiple ARM processors. The Pi Zero W has 1 ARM and the Arduino Due has another. Programmed in C++ the Arduino has ample spare cycles to do interesting things. On sailing boats, performance is predicted by the designer and through experience presented in a polar performance plot.

[Image: polar performance plot for the Pogo 1250]

A polar plot showing expected boat speed for varying true wind angles and speeds. It’s a 3D surface. Interpolating that surface of points using bilinear surface interpolation is relatively cheap on the ARM, giving me real time target boat speed and real time % performance at a rate of well above 25Hz. Unfortunately the sensors do not produce data at 25Hz, but the electrical protocol is simple. Boat speed is presented as pulses at 5.5Hz per kn and wind speed as 1.045Hz per kn. Wind angle is a sine cosine pair centered on 4V with a max of 6V and a min of 2V. Converting all that to AWA, AWS and STW is relatively simple. That data is uncorrected. Observing the Raymarine messages with simulated electrical data I found its data is also uncorrected, as the Autopilot outputs Attitude information, ignored by the wind device. I think I can do better. There are plenty of 9DOF sensors (see Sparkfun) available speaking I2C that are easy to attach to an Arduino. Easy, because SparkFun/Adafruit and others have written C++ libraries. 3D Gyro and 3D Acceleration will allow me to correct the wind instrument for wind shear, heel and mast head velocity (the mast is 18m tall, cup anemometers have non linear errors wrt angle of heel). There are several published papers detailing the nature of these errors. I should have enough spare cycles to do this at 25Hz, to clean the sensors and provide some reliable KPIs to hit while at sea.
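The sensor conversions and the polar lookup described above can be sketched as follows. On the boat this would be C++ on the Arduino; the Python here only illustrates the arithmetic. The function names are made up, and the small polar table is invented for the example, not taken from the Pogo 1250 polar.

```python
import bisect
import math


def pulses_to_knots(pulse_hz, hz_per_knot):
    """Convert a pulse frequency to knots (5.5 Hz/kn for STW, 1.045 Hz/kn for AWS)."""
    return pulse_hz / hz_per_knot


def wind_angle_degrees(sin_volts, cos_volts, centre_volts=4.0):
    """Recover the wind angle (0 to 360) from the sine/cosine pair centred on 4V."""
    angle = math.degrees(math.atan2(sin_volts - centre_volts, cos_volts - centre_volts))
    return angle % 360.0


def _bracket(axis, value):
    """Return (lower index, upper index, fraction) for value on a sorted axis."""
    value = min(max(value, axis[0]), axis[-1])  # clamp to the table edges
    upper = max(1, bisect.bisect_left(axis, value))
    lower = upper - 1
    fraction = (value - axis[lower]) / (axis[upper] - axis[lower])
    return lower, upper, fraction


def polar_target(twa_axis, tws_axis, table, twa, tws):
    """Bilinear interpolation of target boat speed; table[i][j] is speed at twa_axis[i], tws_axis[j]."""
    a0, a1, ta = _bracket(twa_axis, twa)
    s0, s1, ts = _bracket(tws_axis, tws)
    low = table[a0][s0] * (1 - ts) + table[a0][s1] * ts
    high = table[a1][s0] * (1 - ts) + table[a1][s1] * ts
    return low * (1 - ta) + high * ta


# Invented polar points for illustration only: rows are TWA in degrees, columns TWS in kn.
TWA_AXIS = [60.0, 90.0, 120.0]
TWS_AXIS = [10.0, 20.0]
POLAR = [[6.0, 7.0], [7.0, 8.0], [8.0, 10.0]]

stw = pulses_to_knots(34.65, 5.5)           # paddle wheel pulses to boat speed
aws = pulses_to_knots(10.45, 1.045)         # anemometer pulses to apparent wind speed
awa = wind_angle_degrees(6.0, 4.0)          # sine at max, cosine at centre
target = polar_target(TWA_AXIS, TWS_AXIS, POLAR, 75.0, 15.0)
print(f"STW {stw:.2f} kn, AWS {aws:.2f} kn, AWA {awa:.0f} deg")
print(f"target {target:.2f} kn, performance {100.0 * stw / target:.0f}%")
```

The polar is indexed by true wind, so in practice AWA, AWS and STW would first be converted to true wind angle and speed before the lookup. That step is left out of this sketch.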

A longer term project might be to teach a neural net to steer, by letting it watch how I steer, once I have mastered that. Most owners complain their autopilots can’t steer as well as a human. Reinforcement learning in the style of AlphaGo could change that. I think I heard the Vendee Globe boats got special autopilot software for a fee.

All of this leaves me with more budget to spend on sails, hopefully not batteries. I will only have to try and remember not to hit anything while looking at the data.

 





Referendums are binary so should be advisory

28 06 2016

If you ask for the solution to a multi faceted question with a binary question you will get the wrong answer with a probability of 50%. Like a quantum bit, the general population can be in any state based on the last input or observation, and so a Referendum, like the EU Referendum just held in the UK, should only ever be advisory. In that Referendum there were several themes. Immigration, the economy and UK Sovereignty. The inputs the general population were given, by various politicians on both sides of the argument, were exaggerated or untrue. It was no real surprise to hear some promises retracted once the winning side had to commit to deliver on them. No £350m per week for the NHS. No free trade deal with the EU without the same rights for EU workers as before. Migration unchanged. The Economy was hit, we don’t know how much it will be hit over the coming years and we are all, globally, hoping that in spite of a shock more severe than Lehman Brothers in 2008, the central banks have quietly taken their own experts’ advice and put in place sufficient plans to deal with the situation. Had the Bank of England not intervened on Friday morning, the sheer cliff the FTSE100 was falling off would have continued to near 0. When it did, the index did an impression of a base jumper, parachute open, drifting gently upwards.

[Image: FTSE100 chart, screenshot taken 2016-06-28]

The remaining theme is UK Sovereignty. Geoffrey Robertson QC makes an interesting argument in the Guardian Newspaper, that in order to exit the EU, the UK must under its unwritten constitution vote in parliament to enact Article 50 of the Lisbon Treaty. He argues that the Referendum was always advisory. It will be interesting, given that many of those who have voted now regret their decision, if they try and abandon the last theme that caused so many to want to leave. The one remaining thing so close to their heart that they were prepared to ignore all the experts and believe the most charismatic individuals willing to tell them what they wanted to hear. UK Sovereignty, enacted by parliament by grant of the Sovereign. I watched with interest, not least because the characters involved have many of the characteristics of one of the US presidential candidates.

If you live in the UK, and have time to read the opinion, please make your own mind up how you will ask your MP to vote on your behalf. That is democracy and sovereignty in action. Something we all hold dear.





AI in FM

3 02 2016

Limited experience in either of these fields does not stop thought or research. At the risk of being corrected, from which I will learn, I’ll share those thoughts.

Early AI in FM was broadly expert systems. Used to advise on hedging to minimise overnight risk etc or to identify certain trends based on historical information. Like early symbolic maths programs (1980s) that revolutionised the way in which theoretical problems can be solved (transformed) without error in a fraction of the time, early AI in FM put an expert with a probability of correctness on every desk. This is not the AI I am interested in. It is only artificial in the sense it artificially encapsulates the knowledge of an expert. The intelligence is not artificially generated or acquired.

Machine learning covers many techniques. Supervised learning takes a set of inputs and allows the system to perform actions based on a set of policies to produce an output. Reinforcement learning (https://en.wikipedia.org/wiki/Reinforcement_learning) favors the more successful policies by reinforcing the action. Good machine, bad machine. The assumption is that the environment is stochastic (https://en.wikipedia.org/wiki/Stochastic), or unpredictable due to the influence of randomness.
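As a toy illustration of "good machine, bad machine", here is about the simplest reinforcement setup there is, a multi-armed bandit with an epsilon-greedy learner. It is not a model of a market. The three "policies", their success rates, and all of the parameter values are invented for the example.

```python
import random


def run_bandit(success_rates, steps=5000, epsilon=0.1, seed=1):
    """Epsilon-greedy learner over a set of policies with fixed, hidden success rates."""
    rng = random.Random(seed)
    estimates = [0.0] * len(success_rates)  # learned value of each policy
    pulls = [0] * len(success_rates)        # how often each policy was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a policy at random.
            arm = rng.randrange(len(success_rates))
        else:
            # Exploit: use the policy that has been rewarded most so far.
            arm = max(range(len(estimates)), key=estimates.__getitem__)
        # Stochastic environment: the reward is random, with a fixed probability.
        reward = 1.0 if rng.random() < success_rates[arm] else 0.0
        pulls[arm] += 1
        # Reinforce: move the estimate towards the observed reward (running mean).
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return estimates, pulls


estimates, pulls = run_bandit([0.2, 0.5, 0.8])
for arm, (value, count) in enumerate(zip(estimates, pulls)):
    print(f"policy {arm}: estimated value {value:.2f}, tried {count} times")
```

The learner only works here because the set of policies is tiny and their success rates never change, which is exactly what cannot be assumed of a market.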

Inputs and outputs are simple. They are a replay of the historical prices. There is no guarantee that future prices will behave in the same way as historical, but that is in the nature of a stochastic system. Reward is simple. Profit or loss. What is not simple is the machine learning policies. AFAICT, machine learning, for a stochastic system with a large amount of randomness, can’t magic the policies out of thin air. Speech has rules, Image processing also and although there is randomness, policies can be defined. At the purest level, excluding wrappers, financial markets are driven by the millions of human brains attempting to make a profit out of buying and selling the same thing without adding any value to that same thing. They are driven by emotion, fear and every aspect of human nature rationalised by economics, risk, a desire to exploit every new opportunity, and a desire to be a part of the crowd. Dominating means trading on infinitesimal margins exploiting perfect arbitrage as the randomness exposes differences. That doesn’t mean the smaller trader can’t make money, as the smaller trader does not need to dominate, but it does mean the larger the trader becomes, the more extreme the trades have to become to maintain the level of expected profits. I said excluding wrappers because they do add value, they adjust the risk for which the buyer pays a premium over the core assets. That premium allows the inventor of the wrapper to make a service profit in the belief that they can mitigate the risk. It is, when carefully chosen, a fair trade.

The key to machine learning is to find a successful set of policies. A model for success, or a model for the game. The game of Go has a simple model, the rules of the game. Therefore it’s possible to have a policy of, do everything. Go is a very large but ultimately bounded Markov Decision Process (MDP) (https://en.wikipedia.org/wiki/Markov_decision_process). Try every move. With trying every move every theoretical policy can be tested. With feedback, and iteration, input patterns can be recognised and successful outcomes can be found. Although the number of combinations is very large, the problem is finite. So large that classical methods are not feasible, but not infinite, so reinforcement machine learning becomes viable.

The MDP governing financial markets may be near infinite in size. While attempts to formalise will appear to be successful, the events of 2007 have shown us that if we believe we have found finite boundaries of an MDP representing trade, +1 means we have not. Just as finite+1 is no longer finite by the original definition, infinite+1 proves what we thought was infinite is not. The nasty surprise just over the horizon.





What to do when your ISP blocks VPN IKE packets on port 500

12 11 2015

VPN IKE packets are the first phase of establishing a VPN. UDP versions of this packet go out on port 500. Some ISPs (PlusNet) block packets to routers on port 500, probably because they don’t want you to run a VPN end point on your home router. However this also breaks a normal 500<->500 UDP IKE conversation.

Some routers rewrite the source port of the IKE packet so that they can support more than one VPN. The feature is often called an IPSec application gateway. The router keeps a list of the UDP port mappings using the MAC address of the internal machine. So the first machine to send a VPN IKE packet will get 500<->500, the second 1500<->500, the third 2500<->500 etc. If your ISP filters packets inbound to your router on UDP 500, the VPN on the first machine will always fail to work.

You can trick your router into thinking your machine is the second or later machine by changing the MAC address before you send the first packet. On OSX:

To see the current MAC address use ifconfig, and take a note of it.

Then, on the interface you are using to connect to your network, do

sudo ifconfig en1 ether 00:23:22:23:87:75

Then try and establish a VPN. This will fail, as your ISP will block the response to your port 500. Then reset your MAC address to its original

sudo ifconfig en1 ether 00:23:22:23:87:74

Now when you try and establish a VPN it will send an IKE packet out on 500<->500. The router will rewrite that to 1500<->500 and the VPN server will respond 500<->1500, which will get rewritten to 500<->500 with your machine’s IP address.
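To make the trick easier to follow, here is a toy Python model of the router behaviour described above. It is not how any particular router is implemented: the `IpsecGateway` class and the `isp_allows_inbound` function are made up, and the 500, 1500, 2500 port sequence is simply the one observed above.

```python
class IpsecGateway:
    """Toy model of a router that gives each internal MAC its own external IKE port."""

    def __init__(self):
        self._ports = {}  # MAC address -> external UDP source port

    def external_port(self, mac):
        """Return the external source port used for IKE packets from this MAC."""
        if mac not in self._ports:
            # First machine gets 500, second 1500, third 2500, and so on.
            self._ports[mac] = 500 + 1000 * len(self._ports)
        return self._ports[mac]


def isp_allows_inbound(port):
    """Assumed ISP filter: drops anything inbound to the router on UDP 500."""
    return port != 500


gateway = IpsecGateway()
spoofed_mac = "00:23:22:23:87:75"
real_mac = "00:23:22:23:87:74"

# First attempt, with the spoofed MAC, takes the 500 slot and fails.
first = gateway.external_port(spoofed_mac)
print(first, "reply allowed" if isp_allows_inbound(first) else "reply blocked")

# Second attempt, with the real MAC, is mapped to 1500 and gets its reply.
second = gateway.external_port(real_mac)
print(second, "reply allowed" if isp_allows_inbound(second) else "reply blocked")
```

The spoofed MAC is only there to use up the blocked 500 mapping, so that the real MAC lands on a port the ISP does not filter.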

How to debug

If you still have problems establishing a VPN then using tcpdump will show you what is happening. You need to run tcpdump on the local machine and ideally on a network tap between the router and the modem. If you’re on Fibre or Cable, then a Hub can be used to establish a tap. If on ADSL, you will need something harder.

On your machine.

sudo tcpdump -i en1 port 500

On the network tap, assuming eth0 is unconfigured and tapping into the hub. This assumes that your connection to the ISP is using PPPoE. tcpdump will decode PPPoE session packets, if you tell it to.

sudo tcpdump -i eth0 -n pppoes and port 500

If your router won’t support more than 1 IPSec session, and uses port 500 externally, then you won’t be able to use UDP 500 IKE unless you can persuade your ISP to change their filtering config.