
My problem with "1/0 = 0" is that it's essentially masking what's almost always a bug in your program. If you have a program that's performing divide-by-zeroes, that's almost surely something you did not intend for. It's a corner case that you failed to anticipate and plan for. And because you didn't plan for it, whatever result you get for 1/0 is almost surely a result that you wouldn't want to have returned to the user.

When encountering such unanticipated corner cases, it's almost always better to throw an error. That prevents any further data/state corruption from happening. It prevents your user from getting back a bad result which she thinks she can trust and rely on. It highlights the problem very clearly, so that you know you have a problem, and that you have to fix it. Simply returning a 0 does the exact opposite.

If you're one of the 0.1% who did anticipate all this, and correctly intended for your program to treat 1/0 as 0, then just check for this case explicitly and make it behave the way you wanted. The authors of pony are welcome to design their language any way they want. But in this case, they are hurting their users far more than they are helping.
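For illustration, the explicit check described above is tiny (a Python sketch with hypothetical names):

```python
def div_or_zero(numerator, denominator):
    """Division that deliberately treats x/0 as 0.

    Hypothetical helper: it makes the intent visible at the call site,
    instead of relying on the language to define 1/0 == 0 for everyone.
    """
    if denominator == 0:
        return 0
    return numerator // denominator

assert div_or_zero(1, 0) == 0   # the anticipated corner case
assert div_or_zero(7, 2) == 3   # ordinary integer division
```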



An issue with D that has repeatedly engendered heated debate is what should happen when a programming bug is detected at runtime. The two camps are:

1. The program should "soldier on" if it can.

2. The program should go immediately to jail, it must not pass Go, and must not collect $200.

I'm solidly in the latter camp. If a program has entered a state unanticipated by the programmer, then there is no way to know how it entered that state (until it is debugged), and hence no way to know what the program might do next (such as load malware).

1/0 is a bug, and the program should immediately halt.


> If a program has entered a state unanticipated by the programmer, then there is no way to know how it entered that state [...] and hence no way to know what the program might do next (such as load malware).

... or kill the patient ( https://en.wikipedia.org/wiki/Therac-25 ), or bankrupt the company overnight ( https://en.wikipedia.org/wiki/Knight_Capital_Group ), or simply delete all your data (happened to me with some buggy media player that thought my $HOME directory belonged to its download cache).

Programming languages don't generally allow us to "handle", from inside the faulty process, stuff like NULL-pointer dereference, double-free, or segmentation faults. And that's a good thing, because as you said, at this point, there's a very high probability the program is in some rogue state.

(Making special cases for NullPointerException or OutOfBoundsException, like some languages do, is, IMHO, a bad idea that spreads confusion between programming mistakes (i.e. coming from the source code) and invalid runtime conditions (coming from the environment). I'm avoiding the ambiguous terms "errors" or "bugs" here.)

Making "divide by zero" non-fatal belongs to the same family as making `*((int*)0)` return zero, or making `*((int*)0) = x` a no-op. At first, it seems like this will only hide programming mistakes; but this impression comes from my current programming habits, which are tailored to avoid these situations.

But maybe there are advantages in being able to write these things on purpose (at the cost of losing the runtime detection of some programming mistakes). After all, I'm perfectly happy with "free(NULL)".


> (Making special cases for NullPointerException or OutOfBoundsException, like some languages do, is, IMHO, a bad idea that spreads confusion between programming mistakes (i.e. coming from the source code) and invalid runtime conditions (coming from the environment). I'm avoiding the ambiguous terms "errors" or "bugs" here.)

Actually, if you check out the interviews with Java's designers, the checked exceptions mechanism was designed for exactly that purpose: unchecked exceptions are for bugs that should never happen in well-written code, while checked exceptions are for conditions that can happen regardless of how good your code is (e.g. I/O errors).

See: https://www.artima.com/intv/solid.html

There are of course also Errors for fatal conditions that should never be caught (actually in the CLR, even if you catch one, the exception will be re-raised automatically at the end of the handler).

Another interesting thing in the CLR is Constrained Execution Regions, which allow you to run cleanup code reliably even when such a fatal condition is encountered (though the code to be run is limited, e.g. it cannot allocate).


The mis-categorization of some exceptions[0] as checked or not is part of why checked exceptions seem like a mistake. The other is when exceptions cross boundaries of human organizations like nested libraries.

[0] NumberFormatException is a RuntimeException, as are all IllegalArgumentExceptions, WTF?


Another is that well-known attempts at checked exceptions in practice provide an incredibly limited vocabulary for talking about raised exceptions - usually just unconditional "might raise X" or "does not raise X". If you could say things like "raises anything raised by my argument f, except for X because I catch that" it seems like it might be a much better idea (especially if you can infer those...)


> (Making special cases for NullPointerException or OutOfBoundsException, like some languages do, is, IMHO, a bad idea that spreads confusion between programming mistakes (i.e. coming from the source code) and invalid runtime conditions (coming from the environment). I'm avoiding the ambiguous terms "errors" or "bugs" here.)

And that's why I like dual error mechanisms, like the panic vs. manual error handling in Go (despite all its verbosity). It's very important to distinguish between an external error (which should be considered something "normal" to deal with) and an invalid state (which should lead to a halt of the program, maybe after logging something, or to restarting the program from zero).

Traditional exception mechanisms (à la try/catch) make it too hard to deal with an external error, and too easy to think you can recover from an invalid state.
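The same split can be approximated by convention even in exception-based languages: model external errors with a dedicated exception type that callers are expected to catch, and let bug-signaling exceptions propagate and kill the program, panic-style. A Python sketch (names are illustrative):

```python
class ExternalError(Exception):
    """An expected environmental failure: I/O, network, bad input."""

def read_config(path):
    try:
        with open(path) as f:
            return f.read()
    except OSError as exc:  # external: the file may legitimately be missing
        raise ExternalError(str(exc)) from exc

def load_settings():
    try:
        return read_config("/nonexistent/app.conf")
    except ExternalError:
        return "defaults"  # recoverable: soldier on
    # AssertionError and friends are deliberately NOT caught here;
    # an invalid internal state should halt the program, like a Go panic.
```

The convention only works if everyone resists the temptation to write a bare `except Exception:`, which is exactly the "too easy to think you can recover" trap described above.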


D draws a clear distinction between programming bugs (not recoverable) and environmental errors (which can be recoverable).

Failures of the former throw an Error, the latter Exception.


Great analysis! I think that summarizes a lot of dev arguments. That said, Erlang appears to agree with both camps. If an error occurs, the _process_ immediately crashes (not an OS process; it's Erlang lingo for an actor or green thread), but the _program_ soldiers on as if nothing happened. The idea behind the design, I believe, is that programmer errors are unavoidable, but that e.g. a bug in some edge case triggered by uncommon data shouldn't crash the entire server.

I don't know much about Pony, but given that it is also actor based, I'd imagine this would be a great opportunity to crash early and eagerly. It gives you all the advantages of catching a bug early while not crippling a production app under what might be a pretty low-impact, low-frequency bug.


Java has essentially the same behavior with threads. If an error occurs in a thread, it will throw an exception that will terminate the thread if uncaught. Other threads are not directly affected. This is typically what you want as threads are usually independent of each other, often processing an isolated request, which may encounter a bug due to bad data or other conditions. Contrast this to languages like C++, where the usual behavior is to segfault and crash the process.

Special care must be taken if threads are manipulating shared data structures, especially inside a "critical section" where the data structure invariants must be maintained. This is why automatically terminating the JVM when OutOfMemoryError occurs is a best practice, as code is often not paranoid enough to handle it.


One big difference between Erlang and Java is that all processes (threads) in Erlang are safely contained in private memory, whereas in Java threads occur in public memory. Another big difference is that Erlang has supervisor processes to restart child processes that have crashed. Java just farts itself.


Indeed. Java[1] is in fact a sort of worst-of-all-worlds in that it's even possible to catch things that are really unrecoverable errors, like StackOverflowError or OutOfMemoryError. To the uninitiated it might not appear that catching these is so bad, right? But in addition to the problems with finally clauses not necessarily running to completion under such circumstances, catching these exceptions can cause extremely hard-to-debug deadlocks, because object locks won't be released properly.

This is an area where the design of Java/JVM is badly screwed up.

[1] JVM, really. It's pretty unavoidable on most (all?) JVM languages.


In C++ the default behaviour is for the runtime to call std::terminate, not segfault. It's easy to catch exceptions by wrapping the callable in a packaged_task and then the exception is available in the future. The key point is that the exception is rethrown when accessing the result through the future.

Swallowing exceptions is a really poor idea. Does Java really do that?


Erlang originates from telecom, where the behavior you describe makes perfect sense. Equipment should work all the time; you don't want to (or even can't) send out people to debug/restart them.


It's not just programmer error, it can soldier on through some hardware errors, rogue cosmic rays, and arguably the hardest - other people's software errors, like os errors, driver errors, library errors, network stack errors...


Regarding "some hardware errors": This might include operating conditions you didn't consider; say:

The divisor in your code went through a non-ECC DRAM module (maybe in a disk drive) that runs hotter than its specified operating temperature would allow and a random bit-flip changes your `1` into a `0`.

On this topic, the talk on Bitsquatting from Artem Dinaburg during Defcon 19 is worth watching. The most interesting part on bit flips starts around 15:05 minutes: https://youtu.be/9WcHsT97suU?t=15m5s


Adopting this policy was what destroyed the first Ariane 5.[0] IIRC, a conversion from floating-point to a 16-bit integer overflowed, raising an exception which reset the flight control system. The flight control system was redundant, but both copies experienced the same error. The result that was being computed would not have been used.

The lesson I took from this is that neither of these two options is acceptable in software that has to work all the time, like much real-time software. Instead, we should detect all the bugs in such software before runtime, which is only feasible for small systems such as seL4[1]. So we should diligently minimize this software.

For other programs, the best way to handle an error is context-dependent. If a rendering bug will result in a glitch showing in a texture onscreen, that's preferable to exiting the game in the middle of a raid. If a PID motor control system for a robot arm has a division by zero, halting the program without first slowing the motors to a stop could be catastrophic, even causing fatalities. And of course there are numerous cases where continuing execution in the face of such an error is far worse.

[0]: https://web.archive.org/web/20000815230639/http://www.esrin.... [1]: http://sel4.systems/


To say that either (1) or (2) is always the right choice is unreasonable. It depends on context.

If a program is running under a supervisor, it makes sense to bail, as a new clean instance will become available.

If it was the code of the life-support machine that displays data on the LEDs that I was connected to, I'd prefer that it soldier on rather than stop functioning altogether because the data for one of the digits was out of the 0-9 range.

I'm sure there's been many instances where spacecraft had partially malfunctioning software that was remotely corrected because it didn't entirely give up but rather continued to accept input.


Aircraft and spacecraft all have backup systems for critical software. When software self detects a bug (such as an assert failure) the offending computer is shut down and electrically isolated, and the backup is engaged.

What is absolutely NOT done is soldiering on with a computer in an unknown state.

Any software system that is life critical is either designed this way or is very badly engineered. I've written a couple articles about this:

https://www.digitalmars.com/articles/b39.html https://www.digitalmars.com/articles/b40.html


And what if the program is at the lowest level and gets into an impossible state due to a hardware issue (a bad bit)? At some level a program has to soldier on.


Then it sets an I/O pin that pulls the reset switch.


I can't tell if you're being serious. The appropriateness of that would depend on whether the hardware condition was persistent, in which case we have a reboot loop. It doesn't seem reasonable for software meant to run in a harsh environment to assume perfect runtime conditions. Do probabilities and tradeoffs ever get considered when faced with inconsistent state?


If the program is designed to "soldier on", then why not? The question is how is the program designed to react in face of failure: should it fail safe? And is it safe to crash on any error?

My point of view is that a program should abort if it encounters an irrecoverable error, such as imminent memory corruption. However, if it's designed to fail functional, then it could continue to work, either by reinitialising itself, or by aborting the current operation. However, it must be said that it's hard to design such programs, so the option of bailing out is very attractive.

Example: an out-of-bounds access is a typical programmer error. This should probably result in an abort during development, so that it can be fixed. However, it is possible to not want to abort in production just because something is not accessible. The program could instead raise an exception and clean up the call stack up to its initialisation point, then notify the surrounding system that a problem was encountered and resume service. The surrounding system could then keep track of the encountered issues and try to perform a recovery after a threshold is reached. This is how Android recovery works for e.g. crashing system services, if I recall correctly: it first clears some application settings, then system settings, then does an OS reinstall. Now in this example the trigger condition is a crash, but it doesn't necessarily have to be like that.


I can't say that I'm firmly in either camp.

What seems much more important to me is that the behavior is well documented. If I am working with a system where 1 / 0 is defined to be zero, I will deal with that just as I deal with other peculiarities like 1 / 3 being 0 and 1.0 / 3.0 only approximately being a third in some systems. It's a practical concern like many others in programming.
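Those peculiarities are easy to demonstrate (Python shown; the `//` operator mirrors C-style integer division):

```python
from fractions import Fraction

# "1 / 3 being 0": truncating integer division, as in C, Java, etc.
assert 1 // 3 == 0

# "1.0 / 3.0 only approximately being a third": the nearest binary
# double is not the exact rational 1/3.
assert 1.0 / 3.0 != Fraction(1, 3)
```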


My comment from when this came up on Reddit, slightly edited for context:

`/`-by-0 is just an operation and it tautologically has the semantics assigned to it. The question is whether the specific behaviour will cause bugs, and at a glance that doesn't sound like it would be the case.

Principally, division is normally (best guess) used in cases where the divisor obviously cannot be zero; cases like division by constants are very common, for example. The second most common usage (also best guess) is to extract a property from an aggregate, like `average = sum / count` or `elem_size = total_size / count`. In these cases the result is either a seemingly sane default (`average = 0` with no elements, `matrix_width = 0` when zero-height) or a value that is never used (eg. `elem_size` only ever used inside a loop iterated zero times).

It seems unlikely to me that `/` being a natural extension of division that includes division by zero would be nontrivially error-prone. Even when it does go wrong, Pony is strongly actor-based, has no shared mutability and permits no unhandled cascading failures, so such an event would very likely be gracefully handled anyway, since even mere indexing requires visible error paths. This nothing-ever-fails aspect to Pony is fairly unique (Haskell and Rust are well known for harsh compilers, but by no means avoid throwing entirely), but it gives a lot more passive safety here. This honestly doesn't seem like a bad choice to me.


An example well-defined use case is if you want to compute a harmonic mean, e.g.

x = 2/(1/a + 1/b)

This is a form of average where you are giving increased importance to the smaller number. It frequently pops up in science/engineering, e.g. in hydrology when you are computing flow of water underground.

In this case, it's "obvious" that when e.g. a is zero, you want 1/a to be zero so you simply return b. There's typically a clear, big distinction between "smallest realistic number", for instance 1e-3, and a zero, whether actual zero or 1e-34.

For instance, NumPy offers a nice way of doing this:

  import numpy as np

  def safeInv(a):
      tol = 1e-16
      # divide only where a > tol; the zeros in `out` remain elsewhere
      return np.divide(1.0, a, out=np.zeros_like(a), where=a > tol)
This tells NumPy to put the result of the division into an array of zeros shaped like `a`, but only do the divide where `a > tol`, so it returns 0 wherever `a <= tol`.

In this case, though, the programmer understands this division is somehow special and should allow div-by-zero, so it is handled specially.


Quite the opposite I would say: the harmonic mean can be seen as a very good justification for `1/0=inf` as the pragmatic choice.

If, say, `a` is mathematically very small (and `x` accordingly expected to be very small) but `a` becomes zero due to rounding behavior, then having `1/a = inf` results in `x = 0`, which is arguably closer to the expected result than e.g. `x = 2b`.

As someone involved with numerical methods, I have very often relied on `1/0=inf` because it would ultimately warrant the proper behavior in case of rounding errors in the denominator.
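That behavior falls out of IEEE-754 float semantics directly. NumPy shown below, since plain Python floats raise `ZeroDivisionError` instead of returning inf:

```python
import numpy as np

with np.errstate(divide="ignore"):  # silence the divide-by-zero warning
    a = np.float64(0.0)             # denominator rounded all the way to zero
    b = np.float64(4.0)
    assert np.isinf(1.0 / a)        # IEEE-754: 1/0 = inf
    # harmonic-mean-style expression x = 2/(1/a + 1/b):
    x = 2.0 / (1.0 / a + 1.0 / b)
    assert x == 0.0                 # inf in the denominator drives x to 0
```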


I think for floating point 1/0=Inf makes a lot of sense, because divisions are frequently done with continuously-varying properties. Division by integers is different, since you do a different kind of work with it; there is no "very small" integer.


"In this case, it's "obvious" that when e.g. a is zero, you want 1/a to be zero so you simply return b"

b obviously isn't what you want from this harmonic mean when a is 0. And in any case x will be 2b if 1/a is 0, not b.


So, I did forget the correction such that you don't get 2b. The general case is described e.g. in the link below, and this is known as a "zero-corrected harmonic mean". It is actually "obvious" in some cases, but I'll concede that in other cases it's not.

https://www.rdocumentation.org/packages/lmomco/versions/2.3....


Eh? That page says nothing about 1/0 being 0, or anything like that. That function takes the harmonic mean of the non-zero samples, and then adjusts the result by multiplying it by the fraction of the samples that are non-zero. It purposefully avoids dividing by 0. And the result of that function for a==0 and b!=0 is b/2, not b or 2b. So not only is what you claim to be obvious not obvious, it's flat-out wrong.
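A sketch of what that page describes (hypothetical helper, not the actual lmomco code): take the harmonic mean of the non-zero samples, then scale by the fraction of samples that are non-zero:

```python
def zero_corrected_harmonic_mean(xs):
    """Harmonic mean of non-zero samples, scaled by the non-zero fraction."""
    nonzero = [x for x in xs if x != 0]
    if not nonzero:
        return 0.0
    hmean = len(nonzero) / sum(1.0 / x for x in nonzero)
    return hmean * len(nonzero) / len(xs)

# For a == 0 and b != 0 the result is b/2 -- not b, and not 2b:
assert zero_corrected_harmonic_mean([0, 4]) == 2.0
```

Note that no division by zero ever happens here; the zeros are filtered out before the reciprocals are taken.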


> `average = 0` with no elements

Imagine a Mars lander program that looks at the average of altitude measurements to decide when it's ok to jettison the parachute, sees no measurements, and decides that altitude is zero.


You have a point, but I think this is a difference in the target markets. NASA can afford to have a thoroughly-tested, redundant, quickly-accessible failure path that restarts and reinitializes the process, and their processes are so thoroughly tested that they can expect their production instances not to have any software failures.

Pony is made for a more typical scenario where every failure case is a new, untested error path, and history shows that general developers handle errors badly. Pony makes a heroic effort to remove the error path from the language semantics entirely: every failure is explicitly annotated in the code, and has to be handled there. This is much better for the typical developer, allowing them to produce very robust code without NASA's degree of investment towards it. Giving Pony a special unwinding mode just for division by zero would be very dangerous, because this is a new path that only exists for a rarely-encountered sort of error; you cannot assume a sudden shutdown is going to do less damage.

Having Pony present the error cases inline on division by zero would certainly be safer, but there is a degree of disproportionality here: most divisions are impossible to go wrong in this way, and I suspect most of the rest are benign or would be quickly caught by the check after. Since Pony aims to compete against incumbents that are generally less safe in this regard (C++ has UB, Python has exceptions but makes it hard to handle them correctly, etc.), a compromise seems sensible.


It's not only NASA. Zero is just not a reasonable default value for an average. Imagine average credit card balance for FICO score, average blood pressure, average temperature in a freezer, etc.

UB is not the only way. You can have NULLs (they come with their own sets of problems, but still).

In many cases no value is better than incorrect value.


It's a value type, so you can't really have NULL. I don't see the issue with those examples; if you have no samples, there's nothing to mislabel. An average blood pressure of 0 across 0 patients is only going to harm 0 patients.


> I don't see the issue with those examples

Let me elaborate then.

Average credit card balance predicts probability of default. Joe who maxed out his credit card is higher risk than Jane who pays back her entire credit card balance every month. Now we have Jack with no credit card. We predict that Jack is low risk because his average credit card balance is zero.

Second example, defibrillator that monitors blood pressure and shocks the patient when his heart stops (blood pressure drops to zero). We attach the defibrillator to a patient, turn it on, and it shocks the patient immediately because there are no blood pressure measurements yet and therefore it thinks the average is zero.

Third example, a thermostat that turns on the freezer if average temperature is above a set threshold. It never turns on because average (of zero observations) is already zero.

Of course you can hard-code handling of those special cases, but if you did not, failing would be preferable to continuing to work incorrectly.


It seems to me taking averages would be inappropriate in all three of those scenarios.

Credit risk is based on total debt, and since credit cards are a revolving line of credit, the entire credit limit is considered debt, even if the balance is zero. That's why you can improve your credit score by closing out a credit card that you never use and that carries a zero balance (which we did in order to qualify for a mortgage several years ago). In other words, it's a sum, not an average; there's no division involved.

Setting aside the fact that there are much more reliable ways to detect a stopped heart than blood pressure: If you're taking a running average of BP over time, a low reading after a heart attack would just be a single data point, and you'd probably have to accumulate a bunch of them in order for it to register (particularly if you got a high reading just before the event). Even if you take a reading every five minutes, which is ridiculously frequent for BP, the patient will be long dead by the time you notice. You should be acting based on the last reading; again, no division involved.

I don't see why a freezer would be based on running averages rather than the most recent reading either. Thermostats generally work off of two thresholds: a higher one above which the compressor turns on, and a lower one below which it turns off. That smooths out any measurement noise without taking averages. Even if you do use averages for some reason, though, presumably it will start measuring at _some_ point and the compressor will turn on; I don't see why the average would stay at zero.

I'm not convinced that `1/0 = 0` is correct in any meaningful way, but I feel like any situation where it would cause bugs more critical than a UI issue probably points to a deeper design flaw. After all, if the alternative is to crash on a failed assertion, that's not necessarily preferable in a life-or-death situation.


Everything is relative. $1,000 balance is a lot of money for someone making $35,000, not so much for someone making $350,000. Many independent variables that go into risk scores tend to be ratios or averages, hence zero (or near zero) denominator is possible.

It is normally addressed with floors and ceilings or something like Laplace rule of succession, so that no data at all results in a reasonable number. Very rarely that reasonable number is zero.

I'm not saying taking the average is necessarily good in the examples given, just that if you are computing the average, then that computation should not silently return zero if there's no data to average.

That's the way AVG() in SQL or mean() in R work, they return NULL or NA rather than 0. If you know that NA should be 0 you can explicitly COALESCE, but that should be your decision rather than the default behavior.
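Python's standard library makes the same choice as SQL and R here: an empty average is an error, not a silent zero.

```python
import statistics

try:
    statistics.mean([])
    ok = False                       # unreachable: empty mean must raise
except statistics.StatisticsError:
    ok = True                        # no data -> error; defaults are opt-in
assert ok
```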


I feel like the typical target market for Pony is going to be doing operations like that in floating point, or perhaps some explicit fixed-point non-integer type in financial cases. While, yes, I can see this causing issues in some small subset of cases, the question really does come down to cost-benefit. Integer division by zero is not the only way for arithmetic to go wrong; would you expect every overflow to be checked too?


https://en.m.wikipedia.org/wiki/Credit_score_in_the_United_S...

Credit scores don't work this way. Zero is not a valid credit score, and the things you mentioned describe different components of the score.

I work a lot with survey data, to be interpreted by humans. In this problem domain, zero is a sane value for an empty average, though "N/A" is usually better. I wish our language would let us define it as zero so long reports don't die in the middle. But it's about your problem domain, as usual.


There are many different credit scores. The range of valid FICO scores does not include zero, but internal acquisition and behavior scores that banks use might.

But in this case it's irrelevant, because the average balance is an independent variable (an input of the scoring model).


You can check for 0 and throw in those cases, and then propagate the error path appropriately. And as they've already said in this thread, they will be providing checked arithmetic in the standard library to do that for you.


Well, sometimes the average is 0, e.g. (-2 + 2)/2.

And the whole point is that this is meant to catch unforeseen interactions as soon as possible. If you add a check it's no longer unforeseen, and it may easily slip the programmer's mind.


If I'm not mistaken, jeremyjh was referring to checking for 0 in the denominator, not the numerator.


Pony requires the programmer to handle every error case of every function... except this one case, where it will just pick an arbitrary value to handle a case the programmer forgot, hiding the very error Pony was designed to guarantee gets solved. Why not allow partial functions everywhere and have them default to 0 where undefined?


That's a terrible default for an average function lol. Once again in nearly all cases, taking the average of a list without elements is an error.


What happens for 0 / 0? Is this also 0?


It's extra zero.


Is this the same as 0 * 1 / 0?


It returns a larger value of 0.


All zeroes are equal, but some are more equal than others.


ε


inf 0


I almost want two different division operations: One where 1/0 = 0, exclusively for use in progress bars and stuff like that, and another one for everything else.

Because frequently division by zero indicates a bug. But similarly frequently, I end up crapping out annoying little bits of code like

  if (foo == 0):
    return 0
  else:
    return bar / foo


I'm reading the "Pony" tweet quoted in TFA and your comment and I'm left very puzzled: is that really that common to want x / 0 == 0 ? In practice where does that crop up?

You say that you frequently have to write your little shim but honestly I don't remember writing code like that in recent memory.

You talk about progress bars, I suppose it makes sense if you somehow try to copy 0 elements for instance, and you end up dividing by zero when computing the progress, like:

    pos_percent = (copied_elems * 100) / total_elems
And both copied_elems and total_elems are zero. But in this case wouldn't you want the computation to return 100% instead of zero?

It's also a bit odd because it introduces a discontinuity: as the divisor goes towards zero the result of the operation gets greater until it reaches exactly zero and it yields 0. Wouldn't it make more sense to return INT_MAX (for integers) or +inf for floats? If you're writing a game for instance it might work better.

I guess it just goes to show that it probably makes a lot of sense to leave it undefined and let people write their own wrapper if necessary to do what works best for them in their given scenario.
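Such a wrapper is a one-liner either way; the point is that the zero-divisor result is a per-use-case UI decision (hypothetical helper):

```python
def progress_percent(done, total):
    """Progress as an integer percentage; here an empty job counts as done."""
    if total == 0:
        return 100   # or 0, or a "waiting..." sentinel -- your call
    return done * 100 // total

assert progress_percent(0, 0) == 100
assert progress_percent(25, 200) == 12   # truncating integer division
```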


Practically, it's quite common to not immediately know the divisor. In cases where the divisor is initially unknown but takes an imperceptible amount of time to compute it's better to render 0%. Otherwise you might get a flash of a full progress bar (for example) while the divisor is determined.

Of course, it's context dependent. As others mention, your code might be full of stuff like X / (divisor || 1).


The nicest UX would be to say something like "waiting..." while the denominator is zero. Presenting a zero to the user instead is probably an acceptable short-cut, but at the end of the day, it's a UX decision. Which is why it should be handled in explicit if-else code, that lives as close to the UI as possible, rather than being buried in the semantics of basic numerical operators.


If you don't know the number of elements, shouldn't that be tracked in another variable, instead of reusing the total-elements variable and assuming 0 elements means unknown?

Also, if loading the total elements takes a significant amount of time, shouldn't this loading also be reflected in the progress bar?


I’ve run into this a bit with progress related stuff. I bet if you looked at progress bar libraries they’d have similar logic built in.


    X / (divisor || 1).
It's invalid code in C, C++ and Java.


Boolean operators returning values is common, especially in dynamically typed languages. The code is valid in JavaScript, for example, and in Python (except it uses "or" instead of "||").
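A quick demonstration of that difference in Python:

```python
# Python's `or` returns one of its operands, not a coerced 0/1:
assert (0 or 1) == 1    # falsy left operand -> right operand
assert (2 or 1) == 2    # truthy left operand is returned as-is

x, divisor = 10, 0
assert x / (divisor or 1) == 10.0   # safe fallback divisor
```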


In Python, it should probably be corrected with `//` or `1.0` to make it explicitly either an integer division or a float division.


Ah, that's true. Good point.


This is valid C and C++.


It is, but it will always give you a divisor of 1 (because "true" is 1 -- I also thought it would work until I tested it):

    % cat >division.c <<EOF
    #include <stdio.h>
    
    int main(void)
    {
            printf("1/0 = %d\n", 1 / (0 || 1));
            printf("1/2 = %d\n", 1 / (2 || 1));
    
            printf("1.0/0 = %f\n", 1.0 / (0 || 1));
            printf("1.0/2 = %f\n", 1.0 / (2 || 1));
    }
    EOF
    % gcc -Wall -o division division.c
    % ./division
    1/0 = 1
    1/2 = 1
    1.0/0 = 1.000000
    1.0/2 = 1.000000
In Python this is not the case:

    % python2
    >>> 1 / (0 or 1)
    1
    >>> 1. / (0 or 1)
    1.0
    >>> 1. / (2 or 1)
    0.5
    % python3
    >>> 1 / (0 or 1)
    1.0
    >>> 1 / (2 or 1)
    0.5


Some trivia: GNU extensions to C/C++ include a "?:" operator that does what you'd want in this case, e.g.

    x / (divisor ?: 1)


It shouldn't be valid in C++: booleans cannot participate in arithmetic operations. You will get a warning from the compiler if you are lucky.

C doesn't have booleans and treats them as integers 0 or 1, so it can do the math and will always return 1.


That's not true. C++ does define an implicit conversion from bool to int:

https://en.cppreference.com/w/cpp/language/implicit_conversi... (under "Integral promotion")

And C has a boolean type as of C99.


What do you expect for that C code?


In the context of this discussion, the idea was that it would act the same way as Python. But it obviously doesn't.


Real example: I'm collecting some quality signals from a corpus, most of which are some form of ratio, average, weighted average, or scaled average of counting various quantities within the documents. If the elements being counted are missing, I want the term involving that quality signal to disappear from the final ranking calculation.

Defining x / 0 = 0 gets this behavior for free, while leaving zero as an exception means it has to be caught for every single signal calculation, which is a pain when there are potentially dozens of different signals, all of which are counting different things (and have different divisors). I've actually defined a helper function to do this automatically, which also lets me change the default "null object" easily if I choose a different representation or want to apply some baseline value.


Defining a function would seem like exactly the right thing to do! There's no problem.


Module-scoped operator overloading would be really, really nice in this situation, though.


No, it probably wouldn't, unless you were using integers. Pony still uses floats and returns NaN or Infinity or something.


>It's also a bit odd because it introduces a discontinuity: as the divisor goes towards zero the result of the operation gets greater until it reaches exactly zero and it yields 0. Wouldn't it make more sense to return INT_MAX (for integers) or +inf for floats?

Unless you're using unsigned ints, the discontinuity will exist no matter what you do, because of negative divisors.


Rust has your back:

    pub fn checked_div(self, rhs: u8) -> Option<u8>

Checked integer division. Computes self / rhs, returning None if rhs == 0.

You would use this as:

    lhs.checked_div(rhs).unwrap_or(your_sane_default)

This is dramatically better than always returning zero silently, as doing so is bound to be wrong in certain cases. If you run into a situation where you are afraid your RHS may be 0 but still want to do the right thing, this is what you'd use.
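For example, a small self-contained sketch (`checked_div` is part of the standard library on all the integer types):

```rust
fn main() {
    let lhs: u8 = 10;

    // Division by zero yields None instead of panicking...
    assert_eq!(lhs.checked_div(0), None);

    // ...so the caller picks the fallback explicitly, in one place.
    let safe = lhs.checked_div(0).unwrap_or(0);
    assert_eq!(safe, 0);

    // Normal division is unaffected.
    assert_eq!(lhs.checked_div(2), Some(5));
}
```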


This is actually included in the IEEE754 floating point standard - there's a concept of "trapping" vs. "non-trapping" exceptions, where implementations are allowed to decide which exceptions should trap.

Unfortunately, GCC only lets you enable this globally (within a program): see http://www.gnu.org/software/libc/manual/html_node/FP-Excepti....


We will be introducing two sets of integer math operators in Pony: the current ones, which are non-partial and can over/underflow (with division by zero == 0), AND new ones that will be partial functions and cause an error on under/overflow and division by zero.


> new ones that will be partial functions that cause an error on under/overflow and division by zero.

A "strict" mode is almost always an afterthought: once we have errors, people run it in strict mode and facepalm.

One of the first "utilities" I wrote for PHP was something called "pecl/scream", which turned off all the "unchecked" operations across the whole VM.

And in general, this works out poorly for code quality.


There is no golden path here. Checked exceptions for division would make a lot of Pony code objectively worse. Unchecked exceptions are great for PHP, Haskell, Rust or Go, but Pony is trying to do something different - to literally make it impossible to panic in an operation without describing that in the type system. The ergonomics of divide by zero in this context are absolutely debatable, not an issue to dismiss out-of-hand.


Well, except for kernel panics and hardware errors...


You can write user code in Pony that will cause a kernel panic or hardware error?


Yes to hardware errors; you can't really protect against those, or against a kernel panic. A kernel panic is a panic in non-Pony code.


> You can write user code in Pony that will cause a kernel panic or hardware error?

You can always (on an OS with lazy allocation) allocate enough memory to trigger OOM, no matter how safe your language is.


It may be forced to error as a result of one.


Pony is not 1.0 yet. The language is still malleable.


Couldn’t you alternatively introduce total operators that just grow the memory size as necessary instead of overflowing? For me if you’re handling wrapped types, I’d always prefer the numbers to grow into extra memory instead of overflowing or raising exceptions. Erlang, for instance, does this well. Obviously, division may still not be total, though you could define an operator like `//` that is.


A wrapped types library with the corresponding performance tradeoff is something I expect will be added to Pony at some point. There's definitely value in them for a variety of use cases.


Isn't a ternary here nicer?

  return (foo == 0) ? 0 : bar / foo
or even

  return bar / max(1, foo)
in the case where foo is integer or tiny foo would overflow your range anyway


I would say it doesn't really matter much as both are readable. I would prefer either the original if/else or the ternary solution over the last solution as I personally think the intent is not as immediately clear in your "max" form. I would optimize for readability over density, unless there is some reason to write it differently (e.g. performance).

just for fun, in Kotlin (making use of expressions removes some verbosity):

    return when (den) {
        0 -> 0
        else -> num / den
    }

    return if (den == 0) { 0 }  
           else { num / den }


Ternary doesn't really look any better. I'd do this in a language that supports it though:

  return foo && bar / foo


I've run into this a lot, but I'm also not sure what I really want to happen. Even for progress bars, the behavior can be different depending on how exactly you're obtaining the denominator -- does it increase dynamically or is it known from the beginning? Is it one of those fake progress bars that just ignores the fact that it'll stay at 100% for the next hour, or would it filling up genuinely mean that the work is actually done? After all, if you have 0 operations total and you've finished 0 of them, are you really 0% done? Wouldn't it make more sense to say you're 100% done? At the end of the day the user is trying to figure out how long they should keep waiting, to which the answer would be "you don't need to wait, there's no more work left to do".


> But similarly frequently, I end up crapping out annoying little bits of code

Isn't this exactly the point of having a utils library with commonly used pieces of code?


Some utilities libraries can be reflections of deficiencies in the language itself and not simply about common code reuse.


Agreed. Both really.


Most of the time when I have to think about what to do with a 0 divisor, I come to the conclusion that

  if (foo == 0):
    return MAX_FLOAT
  else:
    return bar / foo
is more sound for the given algorithm than "return 0" (still crappy, but more sound).

Then use MAX_FLOAT as an error flag instead of 0, which could have been the result of a legal operation (with bar==0).


Wait, when creating a progress bar you divide in the other direction: (things already done) / (all things). Can you give a specific example of when it causes problems? My experience of writing code like that is that there's a clearer way, or I made a mistake somewhere.


Honest question. What is so annoying about that code? It's handling (in an application-specific manner) the special case of dividing by zero (which the `/` operator doesn't handle). I'm not sure of any other way to handle this.


Actually, bar/foo STILL has a gotcha that can throw an exception; many developers don't think of this.

Just try INT_MIN / -1; it can blow most "foolproof" divide logic away.
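The same gotcha exists for Rust's fixed-width integers: i32::MIN / -1 overflows (and panics in a debug build), even though the divisor is nonzero. checked_div covers this case too, returning None on overflow as well as on a zero divisor:

```rust
fn main() {
    // i32::MIN is -2147483648; its negation does not fit in an i32,
    // so MIN / -1 overflows even though the divisor is nonzero.
    assert_eq!(i32::MIN.checked_div(-1), None);

    // A zero divisor is the other None case.
    assert_eq!(i32::MIN.checked_div(0), None);

    // An in-range division works as usual.
    assert_eq!(i32::MIN.checked_div(2), Some(-1073741824));
}
```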


> But in this case, they are hurting their users far more than they are helping.

How could you possibly make an assertion like this without knowing why they made said choice[1]? Per the article, they didn't do it for fun, or to avoid exceptions (as you point out, more like defer exceptions). FTA:

> As I understand it, it’s because Pony forces you to handle all partial functions. Defining 1/0 is a “lesser evil” consequence of that.

Do you know enough about Pony's language design and decision process to support your claim that this decision (and all its second-order consequences) is hurting users far more than helping them?

[1] On the off chance that you are familiar with the tradeoffs Pony made and do know what you're talking about: care to clarify?


Hi I'm on the Pony core team and spent some time explaining the reasoning to Hillel. He hasn't written any Pony but I did walk him through how we ended up where we currently are.

I will be writing up a post about why we made this decision.


The real issue is that Pony doesn't have the concept of an unrecoverable error. All exceptions in Pony are checked, but sometimes if an error happens there's nothing your program can do and it should just crash.

Edit: Although Pony claims that programs should never crash, the language designers clearly understand the pragmatism of an unrecoverable error because it happens on OOM and stack overflow


It happens in scientific computing with zero bounded signals fairly regularly. Sometimes values hover close enough to zero that the computation results in a/0. It's not desired but the only alternative is a guard that invariably adds a performance penalty. There are other ways around it, such as normalization, but they're not trivial depending on the problem and can actually add other sources of bugs/errors. I understand the author's intent even though I disagree with the result.


1/a when a is close to zero is very large. 1/0 == 0 violates the limit and is just bad math.


Totally agree. In a mathematical proof you wouldn't ignore the edge case just because you think it might work the same way it has for some of your other proofs; it would invalidate everything. And in sequential processing, you could really mess up some of the data with one iteration under a 1/0 assumption that completely changes every iteration after it. Bad news.


What the world really needs: "1/0 = 0" in production build, "1/0 throws error" in development build...


That makes your bugs invisible in production. You could be failing every single transaction in the real world, and never know if your tests don't tickle the unhappy case.


Fairly straightforward in some languages (Ruby for instance).


The author covers this at the end:

[1] The author has a dislike of the notion that 1/0 = 0 in a programming language. The author identifies this as a personal preference; I'm sure that preference is based on falsifiable notions, but the author doesn't elaborate. The author _DOES_ elaborate their motivation for writing the post: the post debunks the notion that 1/0 = 0 is 'bad math'.

[2] Pony chose 1/0 = 0 because the language is set up to require that all functions be complete; if you go with the strict definition of division, an operation that has no definition when the second operand is 0, it's not complete anymore. I agree with your sentiment; this feels like the wrong answer, but it's definitely not as simple as just saying: why don't the Pony folks redefine things such that dividing by 0 raises some sort of error condition?


Does Pony have some analogous dodge for logarithms and square roots of reals, or for the gamma function?


Yep. Having had a couple divide-by-zero errors in a complicated simulation, I'm immensely grateful that it fails hard.


> If you have a program that's performing divide-by-zeroes, that's almost surely something you did not intend for

It's such a common case to consider, I don't think that's compelling. Knowing that a division by zero throws an exception or casts to zero is similarly considered.

> they are hurting their users far more than they are helping.

There's no evidence of this.


Is this different than integer division or integer over/underflow? I mean, besides that the integer behavior is status quo and 1/0 is not. Genuine question.


Integer over/underflow is arguably less arbitrary. Sometimes Z mod 2^k is really where you want to be working, and the rest of the time at least it's the fastest answer. Whether that's enough of a justification to be a difference? shrug


OK but that doesn't have a lot to do with what TFA is actually talking about.


You're absolutely right. Not only that, but it's impossible to distinguish a bad result like 1/0 vs a good result like 0/1.

Even Javascript is better with "NaN"


To clarify, in Javascript 1/0 is Infinity, not NaN.


You are correct. But it really should be NaN, since 1/ε is positive infinity, whereas 1/-ε is negative infinity.

Oh well :)


JS just assumes the limit direction for you, so 1/0 is Infinity, but -1/0 is -Infinity. 0/0 at least is correctly NaN. (Edit: And I just verified against my memory, Matlab (or at least Octave) does the same thing. While Matlab might get characterized as being for the ivory tower, at least it's had a long history of being used for practical math applications within the tower. Edit2: And anyway this is the defined behavior for IEEE floats. Men of industry use industrial standards. :))
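The same IEEE 754 behavior can be checked in Rust's floats (a quick sketch; this is standard library behavior, not anything JS-specific):

```rust
fn main() {
    // IEEE 754 division: the sign of the zero picks the limit direction.
    assert_eq!(1.0f64 / 0.0, f64::INFINITY);
    assert_eq!(-1.0f64 / 0.0, f64::NEG_INFINITY);
    assert_eq!(1.0f64 / -0.0, f64::NEG_INFINITY);

    // 0/0 has no meaningful limit, so it is NaN.
    assert!((0.0f64 / 0.0).is_nan());

    // Positive and negative zero compare equal, but behave
    // differently as divisors.
    assert_eq!(0.0f64, -0.0f64);
}
```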


Moreover, 1/-0 is -Infinity. IEEE float has two zeros, positive and negative.


Nope, I'd disagree. Zero isn't an approximation of some epsilon, it's really just zero. It makes sense for the output sign to match the input sign.


What is the sign of 0?

It's 0.


Parent was talking about the sign of the numerator. But... zero in IEEE 754 representation has a meaningful sign bit. 1/-0 = -inf.


Or as IEEE 754 sees it (you can try this in your JavaScript REPL).

  >> 1 / 0
  Infinity
  >> 1 / -0
  -Infinity


e is nearly 0, -e is nearly -0.

So 1/0 is infinity, and 1/-0 is -infinity.


It’s common for 1/0 = 0 to be exactly what you want. Array average, for example. That said, raising an exception would be the only consistent behavior if a language supports that.


It is not valid to extrapolate an average of 0 from no data at all.

For example, if we examine a sample of zero elephants, then we end up estimating the average elephantine mass as being zero.

This shows to be wildly off as soon as we upgrade our statistical wherewithal to work with a sample size of one.

A center of mass is a kind of average. If we have an empty object made of no particles of matter at all, can we arbitrarily pin its center of mass to the (0, 0, 0) origin of the coordinate system we are using?


This is probably beside the point, but shouldn't the center-of-mass of a massless object be everywhere simultaneously? With no reason to prefer any location, over any other?


The center of mass of a massless object makes as much sense as "everywhere" being a location of anything.


And in other cases you really do want the infinities. For example fast ray-AABB tests like https://tavianator.com/fast-branchless-raybounding-box-inter...
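A minimal one-axis version of that slab test, sketched in Rust (the linked article's code differs; this only shows why the infinities help: a ray parallel to an axis gives inv = ±inf, and the min/max comparisons still do the right thing without a branch):

```rust
// 1-D slab test: does origin + t*dir (t >= 0) ever lie in [lo, hi]?
// Relies on IEEE semantics: dir == 0.0 gives inv = +/-inf, and the
// t-interval comes out as (-inf, inf) or empty, with no zero check.
fn slab_hit(origin: f64, dir: f64, lo: f64, hi: f64) -> bool {
    let inv = 1.0 / dir;
    let t1 = (lo - origin) * inv;
    let t2 = (hi - origin) * inv;
    let tmin = t1.min(t2);
    let tmax = t1.max(t2);
    tmax >= tmin.max(0.0)
}

fn main() {
    // Ray parallel to the slab, origin inside: inv = inf, still a hit.
    assert!(slab_hit(0.5, 0.0, 0.0, 1.0));
    // Parallel, origin outside: both t's land at -inf, so a miss.
    assert!(!slab_hit(2.0, 0.0, 0.0, 1.0));
    // Ordinary case: moving toward the slab.
    assert!(slab_hit(-1.0, 1.0, 0.0, 1.0));
    // Moving away from it.
    assert!(!slab_hit(2.0, 1.0, 0.0, 1.0));
}
```

(As the article notes, the one real gotcha is 0 * inf = NaN when the origin sits exactly on a slab boundary.)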


Common compared to what? Taking an average of an empty array is nonsensical.


I disagree. It’s similar to computing NaN-mean or NaN-sum for an array of all NaN values (which returns 0).

Let’s say some program calculates the mean of an array and then adds that mean to some accumulator.

For purposes of updating the accumulator, the mean of an empty array is perfectly well-defined: it should add nothing to the accumulator (add 0).

This might be a major operating requirement for the mean function, such that guarding or pattern-matching on an empty array and handling a failure is far worse than having a sensible 0-length convention.

Consider the difference between the “head” function of some List type, which has to either raise an exception or wrap the return value in some Maybe structure, because it’s literally not definable, vs the “length” function which has an obvious natural definition for empty arrays that is often highly preferred to some design where length(empty_list) throws an exception and everyone has to handle it in little bits of custom code to specify 0.

To me this topic is all about usability and not about some parochial claim that some operation is nonsensical.
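A sketch of that convention as a helper in Rust (the name `mean_or` is hypothetical; the point is that the empty-array fallback is chosen once, at the definition, not at every call site):

```rust
// Mean with an explicit convention for the empty case: the caller
// chooses the fallback once instead of guarding at every call site.
fn mean_or(xs: &[f64], empty_default: f64) -> f64 {
    if xs.is_empty() {
        empty_default
    } else {
        xs.iter().sum::<f64>() / xs.len() as f64
    }
}

fn main() {
    let mut accumulator = 0.0;
    // An empty batch contributes nothing to the accumulator.
    accumulator += mean_or(&[], 0.0);
    accumulator += mean_or(&[1.0, 2.0, 3.0], 0.0);
    assert_eq!(accumulator, 2.0);
}
```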


> It’s similar to computing NaN-mean or NaN-sum for an array of all NaN values (which returns 0).

Why would you want to do this rather than validating understanding of the data before computing on this?

> For purposes of updating the accumulator, the mean of an empty array is perfectly well-defined: it should add nothing to the accumulator (add 0).

That's not a mean, though, that's a quirk of how you decide to (incorrectly) compute a mean. What's the point of such a program? It should refuse to compute when it's given no values--that's clearly a category error.

Otherwise, you're not using a mean, you're using a mean-or-0-when-lacking-a-mean. Might as well not call it a mean at all so other people can read your code.

Yes, this is pedantic, but this kind of subtle changing meaning of terms is exactly what leads to bugs. Name your functions accurately.


From your response I can tell you don’t do much numerical linear algebra work.

Consider needing to vectorize a large column-wise mean calculation across columns of a large data matrix (where NaN values are sparse but appreciable).

The NaNs might be perfectly reasonable, expected pieces of data, but you still want to understand the distribution of the non-NaN data, and adding extra work to filter it out first might be hugely costly, or even actually wrong depending on what other operations the NaN data is planned to be passed to and how those operations natively handle NaN.

And simple columnar summary stats are just the tip of the iceberg. It gets much more complicated.

By no means is the solution of “diagnose why there are NaNs ahead of time and preprocess them accordingly” even remotely realistic in most use cases. This is why libraries like pandas, numpy and scipy for instance provide specific nan-ignoring functions or function parameters.


That's all fine and dandy, but conceptually it's wrong to say the average of an empty array is 0, and it can and will lead to wrong results in a variety of cases. I'm sure you can think of a lot of these cases yourself. I think in the history of computer science we programmers have found that there are a lot of convenience shortcuts that make sense in a lot of cases but bite our asses in others. Implicit is fast and fun, but it's nice to have your seatbelt on when the car crashes. Going back to the average case: if you want an average function that returns 0 on empty arrays, fine. But that's not the average function, and you shouldn't call it that; names matter. You should call it averageOrZero or something like that.


Why is it conceptually wrong to say the average of an empty array is zero? My undergrad degree is in pure math and my grad degree is in mathematical statistics and I’ve never heard an idea like saying the mean of an empty array is zero is “conceptually wrong.”

You bring up the history of CS, but even there you have debates about what convention to use for defining 1/0 for function totality and theorem provers.

There’s no aspect of pure math derivation of number systems on up through vector spaces that definitively makes a zero mean for an empty array ill-defined. Whatever choice you make, positive infinity, undefined, 0, or any finite values, etc., any such choice is purely down to convention that depends precisely on your use case.


> Why is it conceptually wrong to say the average of an empty array is zero?

It’s not conceptually wrong, it just means the “mean” you’re referring to calculates a different value than the “mean” we’re taught in school. So, underlying assumptions about the differences in “mean” should be communicated where it’s used.


Sure, I agree they should be communicated. Like, in the docs for “standard” mean functions, and not pushed into “specialized” mean functions, since needing this particular convention is not remotely special, and is rudimentary and expected in 99% of linear algebra and data analytics work, which are the largest drivers of these types of statistical functions.


> It’s common for 1/0 = 0 to be exactly what you want. Array average for example.

Wouldn’t that be a (potentially very different) case of 0/0? If there are 0 elements, you wouldn’t have a nonzero numerator, right?


The mean of an empty array isn't zero though.



