• 1 Post
  • 87 Comments
Joined 1 year ago
Cake day: June 9th, 2023



  • It’s frustrating how common IQ-based things still are. For example, I’m autistic, and getting any kind of support as an autistic adult has been a nightmare. In my particular area, some of the services I’ve been referred to will immediately bounce my referral because they’re services for people with “Learning Disabilities”, and they often have an IQ limit of 70, i.e. if your IQ is greater than 70, they won’t help you.

    My problem here isn’t that specific services exist for people with learning disabilities, because I recognise that someone with Down syndrome is going to have pretty different support needs to me. What icks me out is the way that IQ is used as a boundary condition, as if it hasn’t been thoroughly debunked for years now.

    I recently read “The Tyranny of Metrics” and whilst I don’t recall whether it specifically delves into IQ, it’s definitely the same shape of problem: people like to pin things down and quantify them, especially complex variables like intelligence. We are so desperate to quantify things that we succumb to Goodhart’s law (whenever a metric is used as a target, it ceases to be a good metric), condemning what was already an imperfect metric to become utterly useless and divorced from the system it was originally attempting to model or measure. When IQ was created, it wasn’t nearly as bad as it is now. It has been made worse by years of bigots seeking validation, because it turns out that science is far from objective and is fairly easy to commandeer to do the work of bigots (and I say this as a scientist).



  • Congrats! I appreciate this post because I want to be where you are in the not too distant future.

    Contributing to Open Source can feel overwhelming, especially when working outside of one’s primary field. Personally, I’m a scientist who got interested in open source via my academic interest in open science (such as the FAIR principles for scientific data management and stewardship, which say that data should be Findable, Accessible, Interoperable and Reusable). This got me interested in how scientists share code, which led me to the horrifying realisation that I was a better programmer than many of my peers (and I was mediocre).

    Studying open source has been useful for seeing how big projects are managed, and I have been meaning to find a way to contribute (because as you show, programming skills aren’t the only way to do that). It’s cool to see posts like yours because it kicks my ass into gear a little.





  • Last year, I called out a friend for excessively blaming the Greens for various local council decisions/inefficiencies. They had the impression that the Greens had far more seats than they actually did (iirc, they only had 2, out of a total of almost 40). When I pointed this out to them, they were surprised, and we later reflected that they had likely inadvertently bought into propaganda that scapegoats the Greens.

    One of the projects that the Greens had been most loudly opposed to in the area was a set of genuinely pretty dodgy-looking developments, part of a failing scheme led by councillors who had already approved a bunch of other half-complete failures.





    The data are stored, so it’s not a live-feed problem. It is an inordinate amount of data that’s stored, though. I don’t actually understand this well enough to explain it well, so I’m going to quote from a book [1]. Apologies for the wall of text.

    “Serial femtosecond crystallography [(SFX)] experiments produce mountains of data that require [Free Electron Laser (FEL)] facilities to provide many petabytes of storage space and large compute clusters for timely processing of user data. The route to reach the summit of the data mountain requires peak finding, indexing, integration, refinement, and phasing.” […]

    "The main reason for [steep increase in data volumes] is simple statistics. Systematic rotation of a single crystal allows all the Bragg peaks, required for structure determination, to be swept through and recorded. Serial collection is a rather inefficient way of measuring all these Bragg peak intensities because each snapshot is from a randomly oriented crystal, and there are no systematic relationships between successive crystal orientations. […]

    Consider a game of picking a card from a deck of all 52 cards until all the cards in the deck have been seen. The rotation method could be considered as analogous to picking a card from the top of the deck, looking at it and then throwing it away before picking the next, i.e., sampling without replacement. In this analogy, the faces of the cards represent crystal orientations or Bragg reflections. Only 52 turns are required to see all the cards in this case. Serial collection is akin to randomly picking a card and then putting the card back in the deck before choosing the next card, i.e., sampling with replacement (Fig. 7.1 bottom). How many cards are needed to be drawn before all 52 have been seen? Intuitively, we can see that there is no guarantee that all cards will ever be observed. However, statistically speaking, the expected number of turns to complete the task, c, is given by [c = n × (1 + 1/2 + 1/3 + ⋯ + 1/n), the coupon collector expectation], where n is the total number of cards. For large n, c converges to n*log(n). That is, for n = 52, it can reasonably be expected that all 52 cards will be observed only after about 236 turns! The problem is further exacerbated because a fraction of the images obtained in an SFX experiment will be blank because the X-ray pulse did not hit a crystal. This fraction varies depending on the sample preparation and delivery methods (see Chaps. 3–5), but is often higher than 60%. The random orientation of crystals and the random picking of this orientation on every measurement represent the primary reasons why SFX data volumes are inherently larger than rotation series data.

    The second reason why SFX data volumes are so high is the high variability of many experimental parameters. [There is some randomness in the X-ray pulses themselves]. There may also be a wide variability in the crystals: their size, shape, crystalline order, and even their crystal structure. In effect, each frame in an SFX experiment is from a completely separate experiment to the others."
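
    The card-drawing analogy in that quote is the classic “coupon collector” problem, and the ~236 figure falls straight out of it. Purely as my own illustration (the function names and the simulation are mine, not from the book or from any SFX software), here is a quick sketch that checks the number both ways:

    ```python
    import random

    def expected_draws(n):
        """Coupon collector expectation: n * (1 + 1/2 + ... + 1/n)."""
        return n * sum(1 / k for k in range(1, n + 1))

    def simulated_draws(n, trials=10_000):
        """Average number of draws with replacement until all n cards are seen."""
        total = 0
        for _ in range(trials):
            seen, draws = set(), 0
            while len(seen) < n:
                seen.add(random.randrange(n))
                draws += 1
            total += draws
        return total / trials

    print(expected_draws(52))   # ~235.98
    print(simulated_draws(52))  # close to 236, varies a little per run
    ```

    Compare that with exactly 52 draws when sampling without replacement, which is the rotation-method side of the analogy.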

    From the section “The Realities of Experimental Data”: "The aim of hit finding in SFX is to determine whether the snapshot contains Bragg spots or not. All the later processing stages are based on Bragg spots, and so frames which do not contain any of them are useless, at least as far as crystallographic data processing is concerned. Conceptually, hit finding seems trivial. However, in practice it can be challenging."

    “In an ideal case shown in Fig. 7.5a, the peaks are intense and there is no background noise. In this case, even a simple thresholding algorithm can locate the peaks. Unfortunately, real life is not so simple”
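
    To make “a simple thresholding algorithm” a bit more concrete, here is a toy hit finder of my own devising (this is not how CrystFEL or any real facility pipeline does it, and the threshold and peak-count values are made up for illustration). It flags pixels that are both above a threshold and local maxima, then calls the frame a hit if enough of them are found:

    ```python
    import numpy as np

    def find_peaks(frame, threshold):
        """Return (row, col) positions of pixels above the threshold that are
        also local maxima of their 3x3 neighbourhood (toy peak finder)."""
        peaks = []
        rows, cols = frame.shape
        for r in range(1, rows - 1):
            for c in range(1, cols - 1):
                v = frame[r, c]
                if v > threshold and v == frame[r - 1:r + 2, c - 1:c + 2].max():
                    peaks.append((r, c))
        return peaks

    def is_hit(frame, threshold=100.0, min_peaks=3):
        """Classify a detector frame as a hit if it has enough candidate peaks."""
        return len(find_peaks(frame, threshold)) >= min_peaks

    # Toy usage: a noisy frame with three bright fake "Bragg spots".
    rng = np.random.default_rng(0)
    frame = rng.normal(10.0, 2.0, size=(256, 256))
    for r, c in [(40, 50), (120, 200), (200, 30)]:
        frame[r, c] = 500.0
    print(is_hit(frame))  # True
    ```

    The point of Fig. 7.5a in the book, as I read it, is exactly that real frames have background and weak peaks, so this kind of naive thresholding stops being good enough.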

    It’s very cool; I wish I knew more about this. A figure I found for the approximate data rate is 5 GB/s per instrument. I think that’s for the European XFEL.
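
    For a rough sense of scale (my own back-of-envelope, assuming that 5 GB/s figure is right and collection is sustained): 5 GB/s works out to about 18 TB per hour, or a bit over 400 TB per day, which is how you end up needing the “many petabytes of storage” mentioned in the quote.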

    Citation: [1]: Yoon, C.H., White, T.A. (2018). Climbing the Data Mountain: Processing of SFX Data. In: Boutet, S., Fromme, P., Hunter, M. (eds) X-ray Free Electron Lasers. Springer, Cham. https://doi.org/10.1007/978-3-030-00551-1_7



  • He doesn’t directly control anything with C++ — it’s just the data processing. The gist of X-ray crystallography is that we can shoot X-rays at a crystallised protein, which scatters them (diffraction); we can then take the diffraction pattern formed and do some mathemagic to figure out the electron density of the crystallised protein and, from there, work out the protein’s structure.
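
    For anyone wondering what the mathemagic actually is (my gloss, not the parent commenter’s words): the Bragg spots sample the crystal’s structure factors F(hkl), and the electron density comes back as a Fourier sum over them, something like:

    ```latex
    % Standard crystallographic Fourier synthesis: electron density rho at
    % fractional coordinates (x, y, z); V is the unit-cell volume and the sum
    % runs over the measured reflections hkl.
    \rho(x, y, z) = \frac{1}{V} \sum_{hkl} F(hkl)\, e^{-2\pi i (hx + ky + lz)}
    ```

    The catch is that the detector only records intensities, i.e. |F(hkl)|², not the phases, so the phases have to be recovered separately (the famous “phase problem”), which is a big part of why the processing is non-trivial.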

    C++ helps with the mathemagic part of that, especially because by “high throughput”, I mean that the research facility has a particle accelerator that’s over 1 km long and cost multiple billions, because it can shoot super-bright X-ray pulses at a rate of up to 27,000 per second. It’s the kind of place that’s used by many research groups, and you have to apply for “beam time”. The sample is piped in front of the beam, and the result is thousands of diffraction patterns that need to be matched to particular crystals. That’s where the challenge comes in.

    I am probably explaining this badly because it’s pretty cutting edge stuff that’s adjacent to what I know, but I know some of the software used is called CrystFEL. My understanding is that learning C++ was necessary for extending or modifying existing software tools, and for troubleshooting anomalous results.






  • They sound like they’re a bit muddled about how the Equality Act actually works though, bless them, based on this other quote from the same article:

    “The qualifications of an ethnic group, there are five of them, and we hit everyone straight in the bullseye.”

    There are a few other instances where the guy specifically says “ethnic group”, and I even chased up the source to check it wasn’t just rubbish reporting. As you highlight, it’s not too far-fetched that they might be considered a protected minority group, but that’s entirely different from them being considered a minority ethnic group.

    Unsurprised that the pro-fox-hunting people have more money than sense. I imagine their lawyers are better versed in the Equality Act than their spokesman, and are making the beliefs argument rather than the ethnic minority one.


  • It reminds me of the recent CrowdStrike fiasco: apparently kernel-level access was needed for their anti-malware to work properly (because that way their net can cover basically the entire OS), but that high level of access meant that when CrowdStrike fucked up an update, people’s computers were useless. (Disclaimer: I am not a cybersecurity person and am not offering judgement either way on whether CrowdStrike’s claim about kernel-level access was bullshit or not.)

    In a similar way, for identity theft monitoring services to work, they will surely need to hold a heckton of data about you. This is fine if they can be trusted to hold that data securely, but otherwise… ¯\_(ツ)_/¯

    I share your unease, though I don’t feel able to comment on the correctness of your mindset. I will say that on an individual level, keeping an eye on your credit reports in general (from the major credit agencies) will go a long way. Rather than paying for services that give you a score and other fancy “features”, you can request a free or very low-cost report which just has the important stuff you need to know.

    I also know that if you want to be extra cautious, you can manually freeze your credit so that basically no new lines of credit can be opened in your name. This is most useful for people who have already been victims of fraud, or who expect to be at risk (such as from shitty family or a data breach). I don’t know how one sets this up, but I know that if you did want to open a new line of credit, you could call to unfreeze your credit and then freeze it again once the application is done. I have a friend who has had this as their default for years now because of shitty family.