
> To make a system sufficiently configurable, you end up having to re-invent a Turing-complete language inside its configuration files...

Isn't source code basically a configuration file for the compiler/interpreter?

> because ultimately the config format can never be expressive enough to solve all the problems

And if so, I don't think this statement is quite true. Source code is expressive enough.

Configuration files give us flexibility but not expressiveness. Source code, as it exists today, gives us expressiveness but not flexibility[1].

This disparity should be a big red flag that we are doing something really wrong.

This disparity, in my opinion, is caused by the abstraction we use to communicate information between systems within source code: parameterized sub-routines.

A programming language that doesn't use parameterized sub-routines is basically a configuration file. This gives us both expressiveness and flexibility.

[1] Flexibility is within the context of config-vs-source code and ops post. Programming languages are very flexible when you know them.



> Isn't source code basically a configuration file for the compiler/interpreter?

Source code is input, not configuration. Every time you run the compiler, you give it different source code. The whole point of the compiler is to transform source code.

Configuration is "input" that is not expected to change very often, but that the application developer can not guarantee will never change. Configuration falls into a few categories:

1. Context-specific inputs that can not be predicted or discovered automatically by the application. (For example, the DNS zones the process is expected to host - named.conf)

2. Inputs that are not expected to change very often, but should be easy to change if necessary. (ssh client or server settings)

3. For applications with interactive user interfaces, customization and changes to the default interface.

For the vast majority of applications, a text file capable of expressing simple data structures is more than sufficient. Others advocate using only a database, but only to make configuration dynamic so explicit reloads aren't necessary, not to make it more like a programming language.
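As a rough illustration of "a text file capable of expressing simple data structures", here is a minimal Python sketch using the standard-library `configparser`. The section name, keys, and defaults are all made up for the example; the sample string stands in for the contents of a real config file.

```python
import configparser

# Hypothetical built-in defaults: values that rarely change
# but should be easy to override (category 2 above).
DEFAULTS = {"server": {"port": "8080", "log_level": "info"}}


def load_config(text: str) -> configparser.ConfigParser:
    """Parse INI-style text on top of the built-in defaults."""
    parser = configparser.ConfigParser()
    parser.read_dict(DEFAULTS)
    # Context-specific values the application cannot discover
    # on its own (category 1 above) come from the text.
    parser.read_string(text)
    return parser


# Stand-in for the contents of a config file on disk.
SAMPLE = """
[server]
port = 9090
zones = example.com, example.org
"""

config = load_config(SAMPLE)
port = config.getint("server", "port")
log_level = config.get("server", "log_level")
zones = [z.strip() for z in config.get("server", "zones").split(",")]
```

Nothing here is more expressive than key/value pairs and a list, which is the point: the file can be edited safely without knowing Python.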

The source-vs-configuration question is usually only an issue when dealing with large, complex applications with very general specifications.


> Source code is input, not configuration. Every time you run the compiler, you give it different source code.

Interpreted languages (JIT) re-interpret the same source code every time they are run (though they don't have to).

What about monkey patching (http://en.wikipedia.org/wiki/Monkey_patch)? An example usage is altering behavior (the source code) based on running the code in a testing environment -vs- production (as one example).

How about dependency injection? How about lambda expressions?

What about best known practices like favor composition over inheritance? If we hard code a solution using inheritance that is input but if we build out a solution using composition that is a configuration?

These are all tools available to programmers that provide ways of changing behavior at the source code level even though that behavior may not change very often.

This is starting to look like a real gray area to me: input -vs- configuration.

I guess we can try and distinguish between the types of input a system consumes (based on the static nature of the input) but I don't know how useful that is as an abstraction.


Monkey patching is 3rd-party modification to the application itself; or specially modified inputs that simulate such a modification. It's not configuration.
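For anyone unfamiliar with the term, a minimal Python sketch of a monkey patch follows. The `Mailer` class and `fake_send` function are hypothetical; the point is only that the replacement happens from outside the class, at run time, which is why it reads as modification rather than configuration.

```python
class Mailer:
    def send(self, to: str, body: str) -> str:
        # Stands in for real network I/O.
        return f"sent to {to}"


def fake_send(self, to: str, body: str) -> str:
    # Replacement behavior, e.g. for a test environment.
    return f"(test) would send to {to}"


original_send = Mailer.send
Mailer.send = fake_send       # the patch: swap the method at run time
patched_result = Mailer().send("a@example.com", "hi")

Mailer.send = original_send   # undo the patch
restored_result = Mailer().send("a@example.com", "hi")
```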

Dependency injection is a software design pattern. There's no point calling it configuration.

A lambda expression, from a programming perspective, is just an anonymous function, a function so trivial it does not need an associated identifier.

    > If we hard code a solution using inheritance that is
    > input but if we build out a solution using composition 
    > that is a configuration?
No, a solution using composition is not configuration. Maybe it might help to think of the intended audience:

Configuration is intended for end-users or system administrators, NOT programmers. Configuration specifically refers to inputs that you remove from code and place somewhere that is (a) easily modified by anyone and (b) very hard to break.

All of the tools or techniques you list are targeted at programmers (and programmers doing programming, not configuring their text editor or IDE).

    > This is starting to look like a real gray area to me:
    > input -vs- configuration.
Of course it's a grey area. Configuration is ultimately a type of input. But there's still a semantic distinction to make, just like there's a semantic distinction between a desert and a grassland even though the border between the two isn't distinct.

Also, I should make it clear there's an assumption we're talking about application configuration, not configuration in the ITIL sense, where it has an extremely generic meaning.

    > I guess we can try and distinguish between the types
    > of input a system consumes (based on the static nature 
    > of the input) but I don't know how useful that is as
    > an abstraction.
Primarily, the distinction informs decisions about how access to various options and features is provided. Do you require the code itself to be modified? Do you make it a compile-time flag? Do you have the application load it from a default file in a standard location (like /etc/myapp.conf) or do you read the input from stdin? It's possible to have a solid understanding of where to put things without actually using the word "configuration", but why not just use the term since it is already there?


Code vs config is not supposed to be a grey area, but it quickly becomes one when you try to make the application so configurable that you're actually moving business logic to the config files. That's why that is an anti-pattern.


I upvoted your comment, but the reality is that the grey area is easy to find.

Consider apache's httpd.conf. It includes support for fairly advanced features like conditionals, scopes, and sub-configuration of 3rd party modules. In some environments, for example, it may be quite sensible to include some bits of what might technically be considered "business logic" in something like URL rewrites.

Consider DNS records. Not named.conf, but the actual zone data itself. Is that configuration or is it data? Do you check zone files into a configuration repository with your other files, or do you treat them more like a database to be modified on the fly and just back it up periodically? It probably depends on how dynamic the records are expected to be.

Consider emacs: it uses lisp as its configuration language.

On the other end of the scale, you have very small single-purpose scripts that are easily hand-editable. If you have a script that's no more than 1K, maybe you simply put an "options" section at the top with some defaults that can be changed by modifying the code directly. Or maybe your script is so small that even that amount of overhead is pointless.
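A sketch of what such an "options" section might look like in a small Python script. The option names and the path are invented for illustration.

```python
# --- options: edit these values directly ---
OPTIONS = {
    "source_dir": "/var/log/myapp",  # hypothetical path
    "keep_days": 7,
    "dry_run": True,
}
# --- end of options ---


def describe(options: dict) -> str:
    """Summarize what the script would do with the given options."""
    mode = "would delete" if options["dry_run"] else "deleting"
    return (
        f"{mode} files in {options['source_dir']} "
        f"older than {options['keep_days']} days"
    )


summary = describe(OPTIONS)
```

The "configuration" here is just code, but it is fenced off so that someone who doesn't read Python can still change it without touching the logic.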

It's important to remember that the existence of a grey area doesn't mean everyone is stuck in it just because you might be. But that doesn't mean it can't be confusing sometimes.


Yeah, at a certain point you cross over from configuration into scripting. Emacs Lisp is definitely in the scripting zone, on purpose.

Yes, Apache has a crazy amount of configuration in httpd.conf, and supports conditionals and scopes, to the point where they had to write a syntax checker for it. I would actually consider httpd.conf a good example of the "softcoding" anti-pattern.

A certain amount of configuration is good, and deliberately providing a scripting DSL for your program is also good if appropriate.

But when you inadvertently cross over from normal configuration into absurdly complex configuration that resembles a crappy scripting language, giving you the feeling that you're in this big grey area between code and config, that's an anti-pattern.


What would be an alternative to parameterized subroutines, though? Is there another way to avoid re-implementing an algorithm every time you want to use it? And keeping all source in one gigantic file?


Think something like messages with behavior[1]. Think objects where behavior is implemented in properties: a single "makeItSo" property for example.

Every object has the exact same behavioral interface. The exact same behavioral interface means there is no specialization: every message looks the same. We can compose programs (behavior) by hooking up objects/messages as opposed to coding them. This is because we have 100% encapsulation[2]. We need to know nothing about the internal working of an object since it has no parameterized subroutines.

The abstraction for passing information between sub-systems is these messages (every object is a message), as opposed to parameterized subroutines.

[1] We could call it message-oriented programming (not to be confused with message-oriented software/frameworks).

[2] Even a single parameterized method leaks some of the internal workings of an object and leads to specialization of the object's interface. This also leads to tightly coupled software systems.


Aren't you basically just talking about Smalltalk/Obj-C style message-passing rather than method-calling? But perhaps with more complete encapsulation? That's a step in the right direction perhaps, but I don't really see how it obviates the need for interfaces--you still need to know what messages a particular object can respond to, don't you?


> you still need to know what messages a particular object can respond to, don't you?

Messages don't need to know what messages they can respond to as all messages have the same behavioral interface (the "makeItSo" property). Where messages expect certain informational properties (information interface) then, ya, the message would need to check if the message passed to it contains the information it requires (such as a UX layout engine that is waiting for UxControl messages).

If you have some time, here are some examples:

An example using addition:

    message Add (
      left 5
      right 6
    )
or it could be

    message Add (
      left Subtract ( left 5 right 6 )
      right 6
    )
or it could be:

    message Add (
      left FromFieldGet ( name "someForm" field "left" )
      right FromFieldGet ( name "someForm" field "right" )
    )
In all these examples, the Add message does not need to know anything about the messages provided to it in the left and right properties (5 and 6 are actually messages themselves). The last Add example uses a message FromFieldGet that is able to locate information from a form (in this case named "someForm", with a field left and a field right). The form itself has not been defined in the Add example, but it would also be defined as a message passed to some system that creates the UI/UX, which would expect a message of type UxControl.

We could code the usage of our message like this:

    float result = message.asFloat; // the "makeItSo" as a float primitive
    int result = message.asInt; // the "makeItSo" as an integer primitive
    string result = message.asString; // the "makeItSo" as a string primitive
Or let's use Add in our configuration:

    message FormFieldSet (
      name "someForm"
      field "result"
      source RunAsFloat (
        part Add (
          left FromFieldGet ( name "someForm" field "left" )
          right FromFieldGet ( name "someForm" field "right" )
        )
      )
    )
Let's hook that message up to a ux "button"

    message Button (
      action FormFieldSet (
        name "someForm"
        field "result"
        source RunAsFloat (
          part Add (
            left FromFieldGet ( name "someForm" field "left" )
            right FromFieldGet ( name "someForm" field "right" )
          )
        )
      )
      // properties specific to the layout of a ux element on a form
    )
and so on.

Each message knows nothing about the behavior supported by the messages composed in its properties (100% decoupled).
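To make the Add examples concrete, here is a rough approximation in Python. It is only a sketch: Python methods are themselves subroutines, so this imitates the idea rather than implementing a true message-oriented language. The `Const` class and the `forms` dictionary passed to `FromFieldGet` are assumptions added so the example can run; `as_float` plays the role of the single "makeItSo" property and takes no arguments.

```python
from dataclasses import dataclass


class Message:
    """Every message exposes the same behavioral interface."""

    def as_float(self) -> float:
        raise NotImplementedError


@dataclass
class Const(Message):
    # Assumption: literals such as 5 and 6 are wrapped as messages.
    value: float

    def as_float(self) -> float:
        return float(self.value)


@dataclass
class Add(Message):
    left: Message
    right: Message

    def as_float(self) -> float:
        # Add only asks its properties for their value.
        return self.left.as_float() + self.right.as_float()


@dataclass
class Subtract(Message):
    left: Message
    right: Message

    def as_float(self) -> float:
        return self.left.as_float() - self.right.as_float()


@dataclass
class FromFieldGet(Message):
    forms: dict  # assumption: a lookup table standing in for the UI
    name: str
    field: str

    def as_float(self) -> float:
        return float(self.forms[self.name][self.field])


forms = {"someForm": {"left": "5", "right": "6"}}

simple = Add(Const(5), Const(6)).as_float()
nested = Add(Subtract(Const(5), Const(6)), Const(6)).as_float()
from_form = Add(
    FromFieldGet(forms, "someForm", "left"),
    FromFieldGet(forms, "someForm", "right"),
).as_float()
```

`Add` never inspects what kind of message sits in `left` or `right`, so swapping a constant for a form lookup changes the composition but not the `Add` code.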

A message oriented language doesn't (can't) have constructs like for loops, switches, if/then/else, etc. in it as those aren't messages if they are integrated into the language. Instead, these are also viewed as messages (first class citizens).

A for each statement is also a message:

    message ForEach (
      start FromFieldGet ( name "someForm" field "start" )
      stop FromFieldGet ( name "someForm" field "stop" )
      action ...
    )
and so on.

If you look at Obj-C, it is really message passing implemented using traditional parameterized subroutines. Smalltalk is a lot more true to its own nature, but they still use methods as an abstraction for passing information between systems (for example, they will refer to #do as a method as opposed to a message).

I think if Alan Kay had been forced to implement Smalltalk by focusing 100% on messages (because he was told he couldn't use parameterized subroutines), we would have had a much different world today. In programming, the mental models created through abstractions are everything.


I see, you're talking about a completely different paradigm from imperative languages. Thank you for the explanation and code examples.


> Thank you for the explanation and code examples.

Thank you for taking the time to look through them.



