More

thebeardedone · on May 4, 2024

It's been a while but I would benchmark how ClamAV fares. At $(day job) we implemented scanning for file uploads and after our security team tested they noticed that it detected things fairly poorly (60% was on the high side) for very well known malicious content iirc.

We ultimately scrapped it, if anyone else has any better experience I'd love to hear it.

thebeardedone · on April 7, 2018

I have been victim of 3 evos myself, i didnt get very far into the configuration process (db, transmission, etc) before them more or less being rendered useless. The ssd evos are great but the sd cards are simply horrible, tragically I saw them locally bundled with pi starter kits. I ended up buying a sandisk which has not had a single hiccup even after a little bit of abuse.

thebeardedone · on Feb 8, 2018

ClojureCL looks quite awesome, it looks like it takes all of the boiler plate out in comparison to C but I am very interested in the overhead. I tried aparapi for my bachelor thesis and for larger problems it seemed to have quite the overhead that the normal kernel did not have (several years ago). After watching your talk on youtube [1], I saw you compared the GPU implementation to the JVM and other CPU implementations in Clojure. Do you have any numbers for the same kernel being called from a native C openCL application?

[1] https://www.youtube.com/watch?v=bEOOYbscyTs

dragandj · on Feb 8, 2018

For equivalent code, you will not experience any noticeable difference compared to C. It is nothing like Aparapi.

thebeardedone · on Jan 21, 2018

I recently had a similar experience; our product at work is a monolith not in the greatest shape as it has technical debt which we inherited and our product is usually used condescendingly when talking to other teams working on different products. To our surprise when we started testing it with cloud deployments, it was really lightweight compared to just one of the 25 java micro-services from the other teams.

Their "microservices" suffered from the same JVM overhead and to remedy this they are joining their functionalities together (initially they had 30-40).

virmundi · on Jan 22, 2018

I'm switching to go partially due to the jvm. Hopefully I'll get better partitioning on a single small box as I start.

thebeardedone · on Jan 10, 2018

Moritz Lipp's twitter is actually interesting to follow. He is reconstructing images which do not fit into cache. Quite amazing.

https://twitter.com/mlqxyz/status/950378419073712129

(I personally do not have a twitter account but was looking for the paper and stumbled upon it, glad I did!)

thebeardedone · on Jan 4, 2018

It really depends on what you are trying to do with it, for example we have around 15 different integration test configurations that run every night for which VMs may better suited as we want to test installation and deployments automatically for 3 distributions (6 different versions ex. Ubuntu 14.04, ubuntu 16.04, centos 6, centos 7 etc..) and the last 2 windows server. The good thing is that they reproduce the customer environment.

But they have large down sides as well which slow us down. They are a pain to maintain as they are somewhat undocumented (you make a poc for 1 and management always wants more without improvments), a lot of edge cases cause issues which are tough to reproduce (locally sometimes impossible and waste a lot of time) and it takes them a while to start, run, etc.

This is not too tragic for nightly tests as we get the results in the morning but for tests which are started every hour, you do not want to wait that long to verify your changes work/didnt break anything. You can do these in stages, where you create different images based on the result of a previous job (run basic tests that cover base functionality that should always work, then run more in depth tests, then run performance tests at the end to ensure no significant degradation was introduced, etc..) and send out notifications asap in case of failure. The Dockerfile is essentially the documentation as you can see what is installed/configured. You can run everything locally just as it would in a k8 env. which for some reason every one always struggles with.

I am sure there are also edge cases with Docker that are a pain as well but the other selling points show it may be the right direction. You just havevto find use cases and evaluate them.

mbreese · on Jan 4, 2018

I'm not sure what that has to do with Singularity?

I know that the HPC clusters I've used in the past few years have all supported Singularity, but none have supported Docker (aside from our small lab cluster). Many HPC admins are (understandably) hesitant to allow non-admins access to start Docker containers (requiring root), but Singularity has no such user permission issues -- and it's faster than initializing a full VM to run a job. I don't expect that to change so long as starting a container requires root-effective permissions.

I suspect that many data scientists will be in a similar situation w.r.t HPC clusters (except for those that are using custom clouds like Seven Bridges).

thebeardedone · on Jan 5, 2018

Sorry for the mix up, I was replying to rb808's post using the HN app on my tablet, no idea what happened :(.

Regarding HPC, from what I remember they usually have old kernels which are running (2.6) for compatibility reasons where Docker usually is not supported (unless it is backported like in RHEL).

thebeardedone · on Dec 26, 2017

Amusingly I had a similar experience while holding a tutorial for math 2 at my university. After an hour a student asked me what the integral symbol is, as if she has never seen it before. Now I do not find this very tragic as I would gladly recap anything and working throughout my studies I know one might not be able to get through all of the content, but good luck studying when you are missing the basics to even understand the problem at hand.

WalterBright · on Dec 27, 2017

In freshman physics at Caltech I cancelled out the d's in dx/dt. But I knew something was terribly wrong, and I sought out the professor after lecture. He asked me if I knew what a derivative was, and I said no. He suggested I come to his office later.

He gave me a half hour lecture illustrating the basics of integration and differentiation. It was the most useful half hour of my life, and it got me through the semester. Prof. Gomez pretty much saved my @ss.

No, my public high school had not offered calculus in any way, shape or form. My fellow freshmen couldn't believe I had come to Caltech utterly ignorant of calculus.

mreome · on Dec 27, 2017

If you were allowed in a calculus based physics course (some schools offer non-calc versions for non-science/engineering students) without ever being asked if you had taken calculus, there was an issue here other then you not having taken calculus. Calculus is not part of the standard curriculum at many US high schools. I attended one of the top 10 engineering schools in the US, and it was not that unusual for students to need to take calc before taking their first physics course. This kind of basic prerequisite verification is something any major university should be doing.

WalterBright · on Dec 27, 2017

Calculus was required in pretty much all of the freshmen classes. There was no possibility of deferring that in order to learn calculus - it should have been done before the fall. Remedial classes were not offered.

As far as I recall, I was the only freshman who did not know calculus. I can only infer from that that I did not get the message, or ignored it. I certainly received no useful guidance from high school.

thebeardedone · on Dec 27, 2017

Thank you for the reply but sadly there is another issue.

Similarly to you, I was educated in North America where ironically integrals were not part of the curriculum, but when I arrived in Europe I had to take 3 entrance exams: English (the exam is very awkward for the native speakers, you get 2 questions: what is your name and why are you hear after they notice that you do not have an accent), math (their curriculum with the missing content I did not have) and German.

As I understand, if a local does not have a matura in the required fields, they have to take similar examinations which should prepare them.

I think you had a very nice professor though.

WalterBright · on Dec 27, 2017

I had other problems too - did not know how to study, did not know how to take notes, did not know how to organize my time, I'd never had to work at school before, thought I could skip lecture and wing the exams, etc.

It took until my sophomore year to get my act together. Fortunately, freshman year at Caltech was pass/fail, and I barely scraped by that year.

ggtyh · on Dec 27, 2017

What were some of the methods you developed and used?

WalterBright · on Dec 27, 2017

It was straightforward:

1. attend all lectures

2. do all of the homework on time

3. make sure I understood how to do 100% of the homework problems

4. get help with anything I didn't understand

5. write down everything that the prof wrote on the blackboard

6. clean up my handwriting so it was legible

7. lectures/homework/studying always came first

fuzzfactor · on Dec 27, 2017

Exactly what worked for me back then, except in Chemistry in addition to copying from the board I also wrote down everything the professor said along with what I thought about what was presented and was writing so fast it made No.6 completely impossible.

Ever since, when working with someone who has difficult-to-decipher handwriting I advise them that their handwriting is just fine for themselves, so when you need others to read it correctly, in those situations approach it like calligraphy instead.

Naturally I typed my homework and did "calligraphy" on exams.

Also the homework contributed so little credit toward your final grade that a single wrong answer on a midterm or final counted against you mathematically more so than turning in no homework at all, so some good students treated it like they could make up for many hours shortage on homework in just five minutes of the time spent acing the exams. But it turned out to be impossible to get an A for the course if all of the homework was not completed and all correct as well. This was by design and for students to figure out for themselves. They didn't give very many A's in Chemistry.

WalterBright · on Dec 27, 2017

At Caltech, the grade was totally based on the midterm and final exams. The homework only counted if you were on the boundary between one grade and another. In fact, the homework often wasn't even graded, you'd do the homework and then attend a "recitation" session where a grad student would go over it with the students.

The point of the homework was to teach you the material. Skip it at your peril. Mastering it meant you were reasonably prepared for the exams.

thebeardedone · on Dec 24, 2017

Care to elaborate? I havent done any frontend stuff in a few months but it appears to be the only solution there. Even when using typescript, you still want and need to know js.

Regarding Java, i think its a good starting point if you want to hack your first web project together. You sort of see how everything works together. Before that I always recommend looking at C just to know what you are missing out on. If there is some calculation happening you can always call some C/C++ code to see if you get some speed ups by optimizing for the cache (just an interesting experiment if youve never done it before.

thebeardedone · on Dec 24, 2017

Woops read the wrong parent comment. Ignore me :).

thebeardedone · on Dec 10, 2017

I highly recommend watching Danny Rensch's analysis of games 3 and 5 as it is interesting to see the (human) reasoning behind the moves (they are linked on the page). One should also be very skeptical about the "empty" moves from Stockfish. While watching I was surprised that it was making quite a few of these but what is not explained in the video is the reason why it occurred. Nakamura hints on it by mentioning that Stockfish was running on laptop grade hardware and in the post it is mentioned that there was a 1 min/move time limit (not to mention the very low cache). Additionally people who replayed the moves on their own laptops using Stockfish like [1] were unable to reproduce some of the moves.

Although being quite an achievement, I would love to see how Stockfish fares with some minor tweaks. I mean AlphaZero did not lose a single game under very specific conditions, nobody likes a poor winner.

[1] https://chess.stackexchange.com/a/19378

thebeardedone · on Dec 9, 2017

I did not know that there is a clause in the license agreement about not being allowed to disclose benchmarks of databases.

Recently I came into discussion regarding why someone chose to use Oracle instead of Postgres; the argument that someone brought up was that they did not know how Postgres scaled. After pointing out that the data that will be stored will be most likely be a few hundred GB (in an exaggerated worst case scenario) and that Postgres is said to handle 100's of TB of data, they capitulated and said that customers trust Oracle (even though they never see or touch the database).

Personally I would be very interested in seeing comparisons between PG, Oracle and MSSQL as well for different data-sets/use-case scenarios. This would really help as a reference in the future when someone else is making critical decisions which might not appear to make sense.

EDIT: This sounds like a very shady clause; anyone with law insights know whether this would be enforceable in the EU?

DenisM · on Dec 9, 2017

There is a direct comparison, it’s called TPCC TPCD TPCE TPCR family of benchmarks. Each company tries to make themselves look good with those, but they have to follow the rules.

ghaff · on Dec 9, 2017

I haven't dealt with those for a while but, as I recall, all the hardware and software companies in the benchmark (which isn't a comparison except insofar as other companies have also published a benchmark) have to agree to publish. Furthermore, the results are audited and there are various rules about how you can use and talk about the benchmark. (And they cost BIG money to run--into the millions $.)

leggomylibro · on Dec 9, 2017

It is very shady, but also very prevalent. I think that even FreeRTOS - an open-source embedded RealTime Operating System library - had it until Amazon recently MIT'd it.

"If our code is not the best, you MUST NOT TELL ANYONE." Yeah, that's healthy...

user5994461 · on Dec 9, 2017

Postgre doesn't handle hundreds of TB of data.

nl · on Dec 9, 2017

10 years ago it could handle 2PB. https://www.computerworld.com/article/2535825/business-intel...

anarazel · on Dec 9, 2017

It's postgres or postgresql, not postgre.