We are constantly trying to write code that easier for a human to comprehend. These high-level languages gave us the power of expression. And we like this!
We want more expression, and safety in the code we write; but we also want code that humans can understand and reason about. So what if we re-think our current software stack? And have humans evaluate other human’s code.
carbn
is a Humans-as-a-Service, cutting edge technology that makes your code run in the distributed brains of our agents.
I find it curious that things like a “Natural Language Processing” (NLP) exist to bridge the gap between humans and computers.
One of the most predominant libraries to work NLP might be the Natural Language Toolkit. This is coded in python
. A language that is interpreted by a program that is written in c
, that it in turn is compiled to assembler
and then run in the computer chip.
In its purest form, we have myriad of abstractions piled up to get from:
Sorry I followed your minivan for thirty miles. I got caught up in the movie your kids were watching and wanted to see how it ends.
— Elizabeth Hackett (@LizHackett) June 6, 2018
to:
|| Tag | Confidence ||
|| --------- | ---------- ||
|| Relatable | 89% ||
The future. Now.
This is where carbn comes in. Let’s rewrite the sentiment analysis in carbn
:
(sentiment-analysis
(between (list 'Relatable 'NotRelatable 'FirstWorldProblems 'Undecidable))
(from (tweet (@ "LizHackett" 1004201775425966080))))
For those more Object oriented, this code is isomorphic to:
val model = SentimentAnalysis(
tags = listOf(RELATABLE, NOT_RELATABLE, FIRST_WORLD_PROBLEMS, UNDECIDABLE)
)
model.chooseOne()
model.process(Tweet("LizHackett" 1004201775425966080))
Or my preferred way Iambic pentameter
:
× / × / × / × /
when you do choose the sen∙ti∙ment
× / × / × /
from glo∙rious tweet by Liz*
× / × / × / × /
then clas∙si∙fy it please be∙tween
× / × / × / × / × / ×
re∙lat∙able, not re∙lat∙able, first world prob∙lems
× / × / × /
or just un∙de∙cid∙able
*Liz: @LizHackett/status/1004201775425966080
Even though it is a bit more verbose, it is an order of magnitude faster than the other two flavors. We’ll see why in the next section.
Software’s humble beginnings
Let’s start with a review of how we got here.
In the beginning, humans wrote machine code. A set of instructions that a computer would execute. We figured out that this was very powerful, but also very easy to make mistakes. We spend a lot of time thinking, and not a lot of time writing programs; as the economic balance was tipped in the machine’s favor.
With the hard work of many people, we started to be able to have more widespread computational access, cheaper. As humans would have it, we started writing more code. We soon understood that complex ideas needed a complex language. Something far beyond arithmetics and jumps, so we started the journey that started with compiled languages.
Code that was more abstract, but easier for a human to comprehend. These high-level languages gave us the power of expression, in lieu of raw power. And we liked this!
We roughly went from getting a value from an array like this:
movw r0, #:lower16:A
movt r0, #:upper16:A
; Address of ARRAY[3] = base addr + 3*1
ldrsb r2, [r0, #3]; Load ARRAY[3] into r2
ARRAY: .byte 11, 12, 13, 14, 15; byte typed initialized array
And this is a toy example, to keep things short. We hadn’t dealt with dynamic arrays, we know that we wanted the third position, so we could do some pre-emptive calculations on our head before writing this.
To a higher abstraction language like c
:
value = array[POSITION]
Not only it is less code to wrap our heads around, but we can be a bit more certain than before, that array
is somewhat array-like. It still might not have the POSITION
element; but given that we have the power of expression, we can add more code to cover this case, like so:
if(sizeof(array) > POSITION) panic()
value = array[POSITION]
Not content with this level of abstractions we have build many (maybe too many1) more layers of abstractions. We’ve reached a level2 where we can have the compiler itself check this, that we had to humanly-think
array : FixedSizeList '3 Int
value = array[POSITION]
This code will not compile if the POSITION
is greater than 3
, as the underlying list is of size 3
.
We’ve come so far! But I ponder that we can go further.
None of these approaches shield us from coding mistakes. What if we actually wanted the second item of a list, rather than the third? We didn’t have such a compiler… that is, until now.
From silicone to carbn
So we come up with a simple question:
Why did we build layer upon layer of abstraction?
We fully understand the problem to solve. After all, our team was taught in the classic silicon-transistor-Von Neumann way of thinking about software and code. We want more expression, and safety in the code we write; but we also want code that humans can understand and reason about.
We are stretching the problem space of the code in two very different directions:
- On the one hand, we want to eventually get our code to get transformed into machine instructions
- On the other, we care about expressiveness and correctness so that other humans can reason about the code we wrote.
So what if we cut the middle man? And have humans evaluate other human’s code.
How does it work
carbn
provides one major APIs to interact with the raw computing power of the platform: /please/eval
.
curl --location \
--request GET 'https://carbn.florius.com.ar/please/eval' \
--header 'Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ3b3ciOiJZb3UgcmVhbGx5IHRob3VnaHQgdGhpcyB3b3VsZCB5aWVsZCBzb21ldGhpbmcgb3RoZXIgdGhhbiBhIPCfkqk_In0.jmlA3QI58jcRcWoL8Kc8o8wEbVrN9mKXK6S46bea-Y4' \
--header 'Content-Language: en-shakespearean' \
--header 'Content-Type: text/plain' \
--data-raw 'do add thirteen to twelve, many thanks'
Which might return:
<carbn:computation xmlns:carbn="https://carbn.florius.com.ar/carbn.xsd">
<carbn:result>
<carbn:cubyhole>
a749765e-aa57-11eb-bcbc-0242ac130002
</carbn:cubyhole>
</carbn:result>
<carbn:success>
true
</carbn:success>
</carbn:computation>
This fires up two processes: One that quickly returns a UUID of the computation. This is like a cubbyhole for your result. And the other starts the orchestration, parsing, distribution, and eventual reduction of the result, which will at some point compute:
25
When the computation finished, you can retrieve the value by
curl --location \
--request GET 'https://carbn.florius.com.ar/please/give/a749765e-aa57-11eb-bcbc-0242ac130002'
Amazing!
Backend - Agents
On the other side of the communication, we have an army of agents that get a partial (but complete) piece of the request, that they can work with their incredibly powerful carbon-based brain.
This is what an agent might see for your request:
┏━━━━━┯━━━━━━━━━━━━━━━━━━━━━┓
┃ 1 ▲ │ Agent: J ┃
┃ 2 ▲ ├─────────────────────┨
┃ 3 ▲ │╔═══════════════════╗┃
┃ 4 │║× / × / × ║┃
┃ 5 │║do add thir∙teen to║┃
┃ 6 │║ ║┃
┃ 7 │║/ × / ×║┃
┃ │║twelve, many thanks║┃
┃ │╚═══════════════════╝┃
┠─────┼─────────────────────┨
┃ $5k │> 25 ┃
┗━━━━━┷━━━━━━━━━━━━━━━━━━━━━┛
As soon as they can type an answer, the cubbyhole will get filled.
Pricing
You will see in our pricing strategy that we are an order of magnitude more expensive than other cloud computing providers, we don’t have nearly as much up uptime as them, and our latency is much much higher, but can they:
- Find off-by-one errors and correct them on the spot?
- Accommodate a wide variety of languages with zero compile cost?
- Integrate perfectly with multiple non-artificial neural networks?
Time is cheap. Correctness and perfection are what you would be paying for with carbn
.
That and a lot of food for the agents. And sleeping accommodations. And healthcare. Some even get dental.
F.A.Q
Did you consider how similar this is to slavery?
Yes. But we are OK enslaving computers, why wouldn’t we be OK to do so with humans? How different are the two really?
Is the evaluation secure?
The initial payload is first split into chunks for better parallelization and scoping. We can’t reasonably give all the programs to just one agent, so we split each stack so that they can work in the smallest, complete definition of a sub-problem. So you can structure your code in such a way as to limit exposure to secrets from inner agents. We are currently implementing a tier of agents that is more trustworthy than the rest; so they can be in charge of the final reduction. So that one agent might compute a payload, another a URL, and a safe agent do the actual request with the proper secrets.
What is doing the orchestration, filling cubbyholes, responding through HTTP, et al.?
We are indeed using outdated technologies to do this plumbing.
We are working very hard to bootstrap carbn
so it can run on carbn
. For the time being, it is cheaper and more cost-effective to use unfashionable code.
This is satire.
-
Wikipedia :: List of programming languages with more than 3 programming languages. ↩
-
“… a dependent type is a type whose definition depends on a value” – Wikipedia :: Dependent type ↩