I agree. Whenever I get into an argument online, it’s usually with the understanding that it exists for the benefit of the people who may spectate the argument; I’m rarely aiming to change the mind of the person I’m conversing with. Especially when it’s not even a discussion but a more straightforward calling-out of someone for something, that’s for the benefit of other people in the comments, because some sentiments cannot go unchallenged.
That’s fun, I’m stealing that
Elsewhere in this thread, you mentioned that Immich has great documentation. Are there any other FOSS projects that stand out to you as having great user documentation?
The data are stored, so it’s not a live-feed problem. It is an inordinate amount of data that’s stored though. I don’t actually understand this well enough to explain it well, so I’m going to quote from a book [1]. Apologies for wall of text.
“Serial femtosecond crystallography [(SFX)] experiments produce mountains of data that require [Free Electron Laser (FEL)] facilities to provide many petabytes of storage space and large compute clusters for timely processing of user data. The route to reach the summit of the data mountain requires peak finding, indexing, integration, refinement, and phasing.” […]
"The main reason for [steep increase in data volumes] is simple statistics. Systematic rotation of a single crystal allows all the Bragg peaks, required for structure determination, to be swept through and recorded. Serial collection is a rather inefficient way of measuring all these Bragg peak intensities because each snapshot is from a randomly oriented crystal, and there are no systematic relationships between successive crystal orientations. […]
Consider a game of picking a card from a deck of all 52 cards until all the cards in the deck have been seen. The rotation method could be considered as analogous to picking a card from the top of the deck, looking at it and then throwing it away before picking the next, i.e., sampling without replacement. In this analogy, the faces of the cards represent crystal orientations or Bragg reflections. Only 52 turns are required to see all the cards in this case. Serial collection is akin to randomly picking a card and then putting the card back in the deck before choosing the next card, i.e., sampling with replacement (Fig. 7.1 bottom). How many cards are needed to be drawn before all 52 have been seen? Intuitively, we can see that there is no guarantee that all cards will ever be observed. However, statistically speaking, the expected number of turns to complete the task, c, is given by: [c = n × (1 + 1/2 + … + 1/n)], where n is the total number of cards. For large n, c converges to n*log(n). That is, for n = 52, it can reasonably be expected that all 52 cards will be observed only after about 236 turns! The problem is further exacerbated because a fraction of the images obtained in an SFX experiment will be blank because the X-ray pulse did not hit a crystal. This fraction varies depending on the sample preparation and delivery methods (see Chaps. 3–5), but is often higher than 60%. The random orientation of crystals and the random picking of this orientation on every measurement represent the primary reasons why SFX data volumes are inherently larger than rotation series data.
The second reason why SFX data volumes are so high is the high variability of many experimental parameters. [There is some randomness in the X-ray pulses themselves]. There may also be a wide variability in the crystals: their size, shape, crystalline order, and even their crystal structure. In effect, each frame in an SFX experiment is from a completely separate experiment to the others."
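(Aside from me, not from the book: the card game in the quote is the classic "coupon collector" problem, where the expected number of draws is n × (1 + 1/2 + … + 1/n). Here's a rough Python sketch that reproduces the ~236 figure. The function names are my own, and the blank-frame adjustment assumes every frame is blank independently with the same probability, which is a simplification on my part rather than something the book states.)

```python
def expected_draws(n: int) -> float:
    """Expected draws (with replacement) needed to see all n distinct cards.

    Standard coupon-collector result: n * (1 + 1/2 + ... + 1/n).
    """
    return n * sum(1.0 / k for k in range(1, n + 1))


def expected_frames(n: int, blank_fraction: float) -> float:
    """Expected frames needed when a fixed fraction of frames are blank.

    Assumption (mine): each frame is blank independently with probability
    blank_fraction, so only (1 - blank_fraction) of frames count as a draw.
    """
    if not 0.0 <= blank_fraction < 1.0:
        raise ValueError("blank_fraction must be in [0, 1)")
    return expected_draws(n) / (1.0 - blank_fraction)


if __name__ == "__main__":
    print(round(expected_draws(52)))        # the "about 236 turns" in the quote
    print(round(expected_frames(52, 0.6)))  # same deck, 60% of frames blank
```

So with 60% blanks you'd expect to need roughly 590 frames to see what a rotation series gets in 52, and that's before any of the variability the second paragraph talks about.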
“The Realities of Experimental Data” "The aim of hit finding in SFX is to determine whether the snapshot contains Bragg spots or not. All the later processing stages are based on Bragg spots, and so frames which do not contain any of them are useless, at least as far as crystallographic data processing is concerned. Conceptually, hit finding seems trivial. However, in practice it can be challenging.
“In an ideal case shown in Fig. 7.5a, the peaks are intense and there is no background noise. In this case, even a simple thresholding algorithm can locate the peaks. Unfortunately, real life is not so simple”
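(To give a feel for what "a simple thresholding algorithm" could look like in that ideal no-noise case, here's a toy sketch in plain Python. To be clear, this is my own illustration and not how CrystFEL or any real hit finder works; `find_peaks`, `is_hit`, and the threshold and minimum-peak numbers are all made up for the example.)

```python
def find_peaks(image, threshold):
    """Return (row, col) for pixels above threshold that beat all 4 neighbours."""
    rows, cols = len(image), len(image[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = image[r][c]
            if value <= threshold:
                continue
            neighbours = [
                image[rr][cc]
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols
            ]
            # Keep only strict local maxima so one bright spot counts once.
            if all(value > nb for nb in neighbours):
                peaks.append((r, c))
    return peaks


def is_hit(image, threshold, min_peaks):
    """Call a frame a 'hit' if it contains at least min_peaks bright spots."""
    return len(find_peaks(image, threshold)) >= min_peaks


if __name__ == "__main__":
    frame = [[0] * 5 for _ in range(5)]
    frame[1][1] = 100  # bright spot
    frame[3][3] = 80   # second bright spot
    frame[3][2] = 40   # dimmer shoulder, below the threshold
    print(find_peaks(frame, threshold=50))
    print(is_hit(frame, threshold=50, min_peaks=2))
```

As the quote says, real frames have background noise, so a fixed threshold like this falls over pretty quickly.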
It’s very cool, I wish I knew more about this. A figure I found for approximate data rate is 5 GB/s per instrument. I think that’s for the European XFEL.
Citation: [1]: Yoon, C.H., White, T.A. (2018). Climbing the Data Mountain: Processing of SFX Data. In: Boutet, S., Fromme, P., Hunter, M. (eds) X-ray Free Electron Lasers. Springer, Cham. https://doi.org/10.1007/978-3-030-00551-1_7
Unfortunately no. I don’t know any research scientists who even make 6 figures. You’re lucky to break even 50k if you’re in academia. Working in industry gets you better pay, but not by too much. This is true even in big pharma, at least on the biochemical/biomedical research front. Perhaps non-research roles are where the big bucks are.
He doesn’t directly control anything with C++; it’s just the data processing. The gist of X-ray Crystallography is that we can shoot some X-rays at a crystallised protein, which will scatter the X-rays due to diffraction, then we can take the diffraction pattern formed and do some mathemagic to figure out the electron density of the crystallised protein and, from there, work out the protein’s structure.
C++ helps with the mathemagic part of that, especially because by “high throughput”, I mean that the research facility has a particle accelerator that’s over 1km long, which cost multiple billions because it can shoot super bright X-ray pulses at a rate of up to 27,000 per second. It’s the kind of place that’s used by many research groups, and you have to apply for “beam time”. The sample is piped in front of the beam and the result is thousands of diffraction patterns that need to be matched to particular crystals. That’s where the challenge comes in.
I am probably explaining this badly because it’s pretty cutting edge stuff that’s adjacent to what I know, but I know some of the software used is called CrystFEL. My understanding is that learning C++ was necessary for extending or modifying existing software tools, and for troubleshooting anomalous results.
A friend of mine whose research group works on high-throughput X-ray Crystallography had to learn C++ for his work, and he says that it was like “wrangling an unhappy horse”.
You’re right, that is pretty funny. I didn’t notice until you pointed it out in this comment
You’ve bamboozled my attempt to make the same joke at your expense by only mentioning one number in your comment, giving me nothing to add to it. From this point on, I conclude we should only ever mention one number in each comment, for clarity.
Thanks for sharing that post, it was super interesting.
I wish I could see behind the scenes in the Windows UI discussions, to see how we get to what we have today
They sound like they’re a bit muddled about how the Equality Act actually works though, bless them, based on this other quote from the same article:
“The qualifications of an ethnic group, there are five of them, and we hit everyone straight in the bullseye.”
There are another few instances where the guy specifically says “ethnic group”, and I even chased up the source to check it wasn’t just rubbish reporting, because as you highlight, it’s not too farfetched that they might be considered a protected minority group, but that’s entirely different than them being considered a minority ethnic group.
Unsurprised that the pro-fox-hunting people have more money than sense. I imagine their lawyers are better versed on the Equality Act than their spokesman, and are making the beliefs argument rather than the ethnic minority one.
It reminds me of the recent CrowdStrike fiasco: apparently kernel level access was needed for their anti-malware to be able to properly work (because that way their net can cover the entire OS basically), but that high level of access meant that when CrowdStrike fucked up with an update, people’s computers were useless. (Disclaimer, I am not a cybersecurity person and am not offering judgement either way on whether CrowdStrike’s claim about kernel level access was bullshit or not)
In a similar way, in order for identity theft monitoring services to work, they surely will need to hold a heckton of data about you. This is fine if they can be trusted to hold that data securely, but otherwise… ¯\_(ツ)_/¯
I share your unease, though I don’t feel able to comment on the correctness of your mindset. Though I will say that on an individual level, keeping an eye on your credit reports in general (from the major credit agencies) will go a long way to helping there (rather than paying for services that give you a score and other fancy “features”, you can request a free or very low-cost report which just has the important stuff you need to know).
I also know that if you want to be extra cautious, you can manually freeze your credit so basically no new lines of credit can be opened in your name. This is most useful for people who have already been a victim of fraud, or they expect to be at risk (such as by shitty family, or a data breach). I don’t know how one sets this up, but I know that if you did want to set up a new line of credit, you can call to unfreeze your credit, and then freeze it again when your application for the new credit is all done. I have a friend who has had this as their default for years now because of shitty family.
I agree that there’s a strong incentive for even entirely self-interested people to cooperate. I was listing altruism as one of many pro-social behaviours, not as a subset or requirement for cooperation
I’d argue that capitalism is unnatural because even if we work from the assumption that resource hoarding is natural, it’s also necessary to take into account the fact that evolutionarily, humans got to where we are via traits like altruism, cooperation and forming communities. Capitalism is far from natural — it’s an insidious subversion of human nature
I went to a Know Your Rights training run by Green and Black Cross, they’re great.
I had to do it for the first time last year and I was slightly giddy from the novelty of it.
My (somewhat speculative) impression is that the shock that he expresses about receiving the letter isn’t just at the letter itself, but at a legal system that is letting this happen to the extent that it happens at all. By that, I mean that the rule of law is most powerful when it’s acting preventatively — when people are deterred from breaking laws before they break them.
The people who sent the death threats believed they could intimidate without fear of reprisal from the law, and it seems they were right. I can imagine how this might be jarring to someone working to fight modern slavery, where the law is one of the tools used against employers who are exploiting workers in this manner.
This is an excellent comment, thanks for writing this up
Oh my gosh, I can’t believe I never thought of this before, the parody song practically writes itself
Last year, I called out a friend for excessively blaming the Greens for various local council decisions/inefficiencies. They had the impression that the Greens had far more seats than they actually did (iirc, they only had 2, out of a total of almost 40). When I pointed this out to them, they were surprised, and we later reflected that they had likely inadvertently bought into propaganda that scapegoats the Greens.
One of the projects that the Greens had most loudly been opposed to in the area was one that looked like some genuinely pretty dodgy developments as part of a failing scheme led by councillors who had approved a bunch of other half complete failures.