Mike Abegg | September 20, 2018
Free discussion on the internet is under more threat now than it has ever been, whether it be from governments or corporations. A new platform for public discussion that is resistant to single points of failure or control is quickly becoming essential to ensure free expression and public dialogue. To prevent there from being anyone in control of the platform and thus able to destroy its primary principle, a decentralized peer to peer discussion platform seems most workable. Mechanisms need to be built to allow completely open and community/individually driven organization, moderation, and curation. The emergence of blockchain shows that a distributed database can work for public dissemination of information, and a variation of blockchain might be the key for making this a reality.
We need a protocol. By looking at the resilience of other peer to peer networks like BitTorrent[1], we can see that a diversity of clients leads to a healthy ecosystem[2]. Implementations can and should vary, even in their details and their support for new features, but the core protocol must remain the same. The first question we should answer about our protocol is what to call it. It is a protocol for discussions that are transmitted via distributed means, so something like Distributed Discussion Relay Protocol (DDRP) seems like a good enough name for now.
What should DDRP look like? Text based protocols always have a readability (and thus maintainability) advantage, so the basic primitive of DDRP should be text encoded strings of some sort. JSON[3] is a robust, well supported encoding format, so it seems like a good first choice. We need to be able to send the various commands and data of the protocol in atomic components with a defined beginning and end, so our protocol will be frame based.
We want to be able to announce a message to some small subset of peers and have the network push that out to the rest of the world. And we want to get messages from the network. We want the protocol to be flexible enough to deal optimally with a diverse range of networking environments. There are a number of low(ish) level protocol patterns, but tying our protocol down to any one of them might be a mistake and limit client options. DDRP should be transport agnostic and not overly concerned with whether communication is happening via a RESTful API or over a publisher subscriber protocol like STOMP. These should be considered transport subprotocols (i.e. DDRP over HTTP/RSS/email/carrier pigeon/etc).
DDRP is not concerned with how its frames move around (transport), but what the frames look like, what to do when they arrive, and what frames to send out in order to get other frames back.
Let us begin with the most essential component of our protocol: messages. Clients need to be able to post them and receive them. Clients should be able to query peers for messages in a given time range (to fetch old messages that were posted before the client connected to the network), and to subscribe to receive relayed message frames as they come in. Whenever a client receives a message it should forward it to all peers that are subscribed. Subscription should not be necessary to use DDRP; it should be possible to use it in a polling fashion. However, this would be an implementation detail of the transport protocol. Some transport subprotocols might use persistent connections, while others might solve the same problem by establishing a request id and polling the id periodically for results.
Message frames should contain the actual text of the message, and also have additional message metadata. Things like tags attached by the original poster at the time of submission, a (UTC) timestamp, information about the user who made the post, references to other messages, etc. should all be included and be considered part of "the message".
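To make the post/query/subscribe behaviour a little more concrete, here is a minimal in-memory sketch. Nothing in it is part of any defined DDRP specification: the Peer class, the frame fields (id, timestamp, text) and the duplicate check are all assumptions made for illustration, and no real transport is involved. The duplicate check is added here only so that two peers subscribed to each other do not relay the same frame forever.

```python
from typing import Callable, Dict, List

# Hypothetical frame shape: {"id": str, "timestamp": float, "text": str}
Frame = Dict[str, object]


class Peer:
    """In-memory stand-in for a DDRP node; no real transport is involved."""

    def __init__(self) -> None:
        self.messages: Dict[str, Frame] = {}
        self.subscribers: List[Callable[[Frame], object]] = []

    def subscribe(self, callback: Callable[[Frame], object]) -> None:
        """Register a callback that gets every newly seen frame."""
        self.subscribers.append(callback)

    def receive(self, frame: Frame) -> bool:
        """Store a frame and relay it; return False if it was already known."""
        if frame["id"] in self.messages:
            return False  # assumption: drop duplicates to stop relay loops
        self.messages[frame["id"]] = frame
        for callback in self.subscribers:
            callback(frame)
        return True

    def query(self, start: float, end: float) -> List[Frame]:
        """Return stored frames with start <= timestamp <= end, oldest first."""
        hits = [f for f in self.messages.values() if start <= f["timestamp"] <= end]
        return sorted(hits, key=lambda f: f["timestamp"])


# Two peers subscribed to each other: a frame posted to one reaches both.
a, b = Peer(), Peer()
a.subscribe(b.receive)
b.subscribe(a.receive)
a.receive({"id": "m1", "timestamp": 10.0, "text": "hello"})
b.receive({"id": "m2", "timestamp": 20.0, "text": "hi there"})
```

A polling transport would only ever call query, while a persistent-connection transport would lean on subscribe; the stored frames are the same either way.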
Messages are not very useful in isolation. In order to have a discussion you need to be able to reference posted messages. But how would one make a straightforward reference to a message that might come in at any time and from anyone? There is no central database here to produce a globally unique auto-increment surrogate key[4]. How do we solve the problem of identity for messages?
We can take a lesson from the IPFS project and use hashes[5]. To keep things future proof, we should allow for different hash algorithms (something like multi-hashing[6]) and record which algorithm was used alongside the hash value we store. To allow for collision resolution[3] (as extremely remote as the possibility is[7]) we should allow multiple-hashing[8], so rather than just a simple hash we would have a hash chain, where each element in the chain is the hash of the serialized message + the previous hash. So a message’s identity becomes an ordered sequence (though typically of length one) of hash-type:hash-value pairs. To prevent or at least mitigate problems with clients that implement disjoint sets of hashing algorithms, we should allow multiple hash chains.
If the hash is part of the message frame, this leads to a bit of a chicken and egg problem: how can we hash a message that has to contain its own hash? The easiest solution to this problem is to separate the message data and metadata from the message frame hashing/signing data. This way "the message" can remain the same even if clients want to add some additional metadata (perhaps signing the message, or adding different hash chain ids) of their own as they pass messages around, and it will not impact the identity of the message.
This does bring up the question of what we are hashing and how we are hashing it. If the hash refers to a subset of the message frame, how do we hash it? There seem to be two options. Because the frame is being passed around in a serialized fashion, we could extract only the hashed portions of the serialized frame and hash that. Alternatively we could define a stricter serialization than the JSON specification provides, such that there is exactly one way to serialize a given frame, then serialize only the subset we are hashing and hash it. It seems like allowing the basically free-form text of a partial JSON deserialization would be both brittle and a vulnerability. We should probably go with the hash being of an especially strict JSON encoding of message data/metadata. Some of the attributes of "strict" might include things like disallowing whitespace outside of data values, and defining the order of keys. There might be more needed, but these are a good start.
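A hash chain along these lines could be sketched as follows. This is only one reading of "serialized message + the previous hash": here the raw digest bytes of the previous element are appended to the serialized message before hashing, and the first element hashes the message alone. The function name hash_chain and the (algorithm, hex digest) pair layout are assumptions for illustration.

```python
import hashlib
from typing import List, Tuple

# One identity is an ordered list of (hash-type, hash-value) pairs.
HashChain = List[Tuple[str, str]]


def hash_chain(serialized: bytes, algorithms: List[str]) -> HashChain:
    """Hash the message once per algorithm, feeding each digest into the next."""
    chain: HashChain = []
    previous = b""  # the first element hashes the serialized message alone
    for name in algorithms:
        digest = hashlib.new(name, serialized + previous).digest()
        chain.append((name, digest.hex()))
        previous = digest
    return chain


message = b'{"text":"puppies are awesome"}'
identity = hash_chain(message, ["sha256"])            # the typical one-element case
extended = hash_chain(message, ["sha256", "sha512"])  # extended after a collision
```

Because the first element of the extended chain is the same as the short chain, a client that only knows the short identity can still recognise the message.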
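As a rough sketch of what "strict" could mean in practice, Python's standard json module can already be told to sort keys and drop optional whitespace. The function name strict_serialize and the example fields are assumptions, and a real specification would likely need to pin down more (number formatting and string escaping, for instance) than this does.

```python
import hashlib
import json


def strict_serialize(data: dict) -> bytes:
    """Serialize with sorted keys and no whitespace outside of data values."""
    return json.dumps(
        data,
        sort_keys=True,          # defined key order
        separators=(",", ":"),   # no optional whitespace
        ensure_ascii=False,      # emit text as UTF-8 rather than \u escapes
        allow_nan=False,         # reject NaN/Infinity, which are not valid JSON
    ).encode("utf-8")


# The same message data built in two different key orders.
first = {"text": "hello", "tags": ["intro"], "timestamp": 1537401600}
second = {"timestamp": 1537401600, "tags": ["intro"], "text": "hello"}

first_hash = hashlib.sha256(strict_serialize(first)).hexdigest()
second_hash = hashlib.sha256(strict_serialize(second)).hexdigest()
```

Both dictionaries serialize to the same bytes, so they hash to the same identity regardless of how a client happened to order the keys.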
So we have solved the problem of identity for a message. A message frame will have an outer section which will include a precalculated set of identity hash chains, and perhaps some other metadata, and an inner section which will include all message data/metadata that is used in the calculation of the hash(chain(s)). So what can we use these references for?
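Put together, a frame with an outer and an inner section might look something like the sketch below. The field names (ids, message, refs, relayed_by) are hypothetical, and for brevity each identity is a single algorithm:hash string rather than a full chain.

```python
import hashlib
import json


def message_id(inner: dict, algorithm: str = "sha256") -> str:
    """Identity of the inner section only, as 'algorithm:hexdigest'."""
    payload = json.dumps(inner, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return algorithm + ":" + hashlib.new(algorithm, payload).hexdigest()


def build_frame(inner: dict) -> dict:
    """Wrap message data/metadata in an outer section holding its identity."""
    return {"ids": [message_id(inner)], "message": inner}


def verify_frame(frame: dict) -> bool:
    """Recompute the first identity we know how to compute and compare it."""
    for claimed in frame["ids"]:
        algorithm = claimed.split(":", 1)[0]
        if algorithm in hashlib.algorithms_available:
            return message_id(frame["message"], algorithm) == claimed
    return False  # no identity uses an algorithm this client supports


frame = build_frame({"text": "hello", "refs": [], "timestamp": 1537401600})
frame["relayed_by"] = "some-peer"  # extra outer metadata leaves the identity alone
```

Adding relayed_by to the outer section does not change the result of verify_frame, while changing anything inside message does.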
A message thread is in its simplest form an ordered list of messages, each message being a cumulative response to all of the messages that came before it. In reality things tend to be a bit messier. Usually you will have a message that is a response to one other message somewhere in the thread and only that one message. Sometimes one message will be a response to multiple other messages. Even worse sometimes part of one message will be a response to one collection of messages, another part of the message will respond to another collection of messages and other parts of this same message will be in response to no other message in particular. One solution might be to break a message up into multiple subsections, and allow assigning of metadata to each of the sub sections. This would make for a very complex message frame, and perhaps no way to write a usable UI for it. Fortunately there already exists a tool for ‘marking up’ parts of a message, markup! We can adopt an existing markup language[9] (or a subset of one) and extend it, or just come up with our own from scratch.
If we use markup in the text of a message, we can define sections of our message that respond to one (or more) other messages without complicating the message frame very much. The message frame can include a reference to all other messages by one (or all) of their hash chains in the message metadata, and the markup can just point to one of these references. This does introduce an issue with message references needing to be updated when a collision is detected. Assuming that any message received before the later colliding message refers to the old message, and that everything newer refers to the new message, will work well enough, with the option for clients to determine how these reference updates are performed. If under certain circumstances both ends of the collision end up getting referenced, I think that would be acceptable given how astronomically rare a hash collision would be (1 in 2^256).
So now we have messages, and metadata with (at least) references to other messages. These two pieces of information allow us to construct a thread. Because the identities of messages are hashes which cover all of their references, this forms a tree like structure (technically a directed acyclic graph) which is in some ways very similar to blockchain[10]. The key difference between the distributed ledger technology of blockchain and DDRP is that blockchain goes to great expense to ensure that there is only one legitimate chain[11], and all forks eventually get discarded when every client reaches consensus. DDRP takes advantage of forks[12] to represent different subthreads of a discussion and even allows side threads to remerge (by a message responding to more than one other message). The most difficult problem to solve in blockchain is a valuable feature in DDRP.
We have solved the problem of identity for messages and used it to build discussion threads, but there is another issue of identity that needs to be addressed: the identity of the users sending the messages. For any form of meaningful discussion to take place, some amount of identity is required. You need to know that the person responding to you was the person you addressed a message to earlier. But in many parts of the world having the wrong opinions can result in very bad real life consequences, so associating your online identity with your real world identity might not be the best idea. This is part of the reason why services like Twitter keep linking your identity on their service and your real life identity optional[13]. There is a well established middle ground on the internet for solving this problem, that of pseudonymous identity[14]. In the context of our protocol all that is needed is a way to establish that the same person is making two or more different messages.
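The fork and remerge structure can be shown with a small sketch. The short ids below are stand-ins for real hash chains, and the two helper names are assumptions; the ordering itself comes from Python's standard graphlib module (Python 3.9+), which takes a mapping from each node to the nodes that must come before it.

```python
from graphlib import TopologicalSorter
from typing import Dict, List

# Hypothetical ids standing in for hash chains; each maps to the ids it responds to.
references: Dict[str, List[str]] = {
    "root": [],
    "reply_a": ["root"],
    "reply_b": ["root"],               # a fork: a second subthread under root
    "merge": ["reply_a", "reply_b"],   # a remerge: responds to both subthreads
}


def reading_order(refs: Dict[str, List[str]]) -> List[str]:
    """Order messages so every message comes after everything it responds to."""
    return list(TopologicalSorter(refs).static_order())


def replies_to(refs: Dict[str, List[str]], target: str) -> List[str]:
    """Direct responses to one message, i.e. the forks starting at it."""
    return sorted(mid for mid, parents in refs.items() if target in parents)


order = reading_order(references)
forks = replies_to(references, "root")
```

The relative order of reply_a and reply_b is left to the client, which is exactly the freedom a blockchain would spend effort removing.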
Cryptographic signing[15] is the key to this problem. Your identity boils down to a public key (included in message metadata) that can be used to verify a cryptographic signature (also included in the metadata) generated by a matching private key. Outside the core DDRP protocol, there is no need for anything more, and the public key is your pseudonym, and the private key is how you sign things. But given the way this is defined it is entirely possible to keep signing optional, so anonymous discussion can still be had (for those who want it), and clients can be configured to generate new identities for each thread, or each post. On the flip side, clients can be configured to ignore anonymous messages or messages from users that the client has not seen enough times before. It is left up to the users how their discussion will be had.
Having introduced personal identity we now have a powerful tool for other services. The first thing to notice is that the identity being a public/private key pair means that it can be shared by multiple users. Anyone who has the private key can act as the identity it represents anywhere in the world. The identity is not tied down to any authority, but this has a downside: if the private key is lost or stolen, there is no one you can go to in order to have it reset.
Now that we have the ability to sign things and associate them with a user identity there are a number of new tools this opens up for us. The most obvious use for such a signature would be for publicly approving or disavowing a message. So let's delve into the world of moderation and curation. In many online communities today moderation and identification of disruption to meaningful discussion is an important feature. Some low moderation communities do exist, but no online community is entirely unmoderated. Without moderation it would be too easy to disrupt conversations simply by posting entire pages of the phone book repeatedly to every random message related to a subject you didn’t like (or everything if you didn’t like the platform).
This seemingly goes directly in opposition to the central goal of DDRP, a decentralized, uncontrolled discussion platform. This would only be true if whoever does the moderation was established as an inscrutable authority. Let's look at the Reddit model of moderation[16]. On Reddit anyone can open a new subreddit, which is basically a somewhat self contained community, and whoever starts that community is its dictator and responsible for moderating it. It is a model that works well most of the time, but it is not without its faults. It is inevitable that people who found communities sometimes turn against them. Sometimes their hearts are in the right place but they are just bad at moderation, sometimes the community as a whole moves away from its moderation staff, and very frequently communities will balkanize into smaller warring sub-communities and one faction will gain control over moderation, stifling the other factions. All of these situations lead to someone gaining control over the discussion and using that power to silence people they do not like. How can moderation be done in a way that is less susceptible to these kinds of subversion?
In many of these situations if people in the community could simply choose to ignore certain moderators the problem would go away. The ability to subscribe and unsubscribe to a moderator, or a team of moderators who share a private key, should be DDRP's solution. Any user who has an identity would be able to mark a message as either approved or unapproved, and do the same for user identities. If you find that a moderator has been acting inappropriately you simply remove them from your moderation subscriptions.
The act of marking a message would be done in the form of its own data frame that associates a message hash ID with a moderation activity, a timestamp and the moderator's signature. All of this would then get propagated through the network. Clients could permanently associate such moderation actions with the message, allowing moderation actions to spread as quickly as the message itself. But it might be prudent for clients to only do this with moderator identities that they have identified as being widely subscribed to, or to require peers to specify all moderators they are subscribed to. A non-hostile DDRP client should forward all moderation actions it receives, and as long as enough DDRP nodes do this then anyone who is subscribed to a given identity should receive such actions. If a moderator is established enough then they might even publish a public client node for subscribers to connect to, to ensure they get that moderator's moderation actions.
Closely related to moderation is curation. This can be accomplished by tags. Initial tagging needs to be done by the original poster, but community driven tagging can be a powerful tool to harness in this situation. Tags can be applied by any person, similarly to how a moderation action can be applied by any person. And similar to moderation you should be able to subscribe to curators whose tags you consider trusted. There can be many different curation services, some of which might only focus on a single tag. And there might be a curation service that offers many tags, but maybe you only want to subscribe to one of them. Tags applied by identities that fail your moderation subscriptions would not be included. Between moderation and curation subscriptions, tag spam[17] should be a solvable problem.
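On the client side, applying moderation subscriptions could be as simple as the sketch below. The frame fields (moderator, target, action, timestamp) and the action names are assumptions, signature verification is assumed to have happened before frames reach this function, and "the latest action per moderator wins" is one possible policy rather than anything the protocol dictates.

```python
from typing import Dict, List, Set, Tuple


def hidden_messages(actions: List[Dict], subscribed: Set[str]) -> Set[str]:
    """Message ids whose latest action from a subscribed moderator is 'unapproved'."""
    latest: Dict[Tuple[str, str], Dict] = {}
    for action in actions:
        if action["moderator"] not in subscribed:
            continue  # actions from moderators we do not follow are ignored
        key = (action["moderator"], action["target"])
        if key not in latest or action["timestamp"] > latest[key]["timestamp"]:
            latest[key] = action
    return {a["target"] for a in latest.values() if a["action"] == "unapproved"}


# Hypothetical moderation frames, with public keys abbreviated to short names.
actions = [
    {"moderator": "mod1", "target": "m1", "action": "unapproved", "timestamp": 1},
    {"moderator": "mod2", "target": "m2", "action": "unapproved", "timestamp": 1},
    {"moderator": "mod1", "target": "m3", "action": "unapproved", "timestamp": 1},
    {"moderator": "mod1", "target": "m3", "action": "approved", "timestamp": 2},
]

only_mod1 = hidden_messages(actions, {"mod1"})
both = hidden_messages(actions, {"mod1", "mod2"})
```

Unsubscribing from a moderator is then just removing their key from the subscribed set; their actions still travel through the network but stop affecting what you see.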
Additional curation options could include community voting mechanisms, where a person can send out a signed up or down vote frame and it can then get associated with a message. Votes generated this way would however be public. Votes of identities who fail your moderation validation would not be included. This could allow moderation to help prevent the problem of brigading[18], where people outside of one community downvote messages in another.
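A vote count that respects moderation might be tallied like this. The vote frame fields (voter, target, value, timestamp) are hypothetical, signatures are assumed to be checked elsewhere, and counting only the most recent vote per identity is an added assumption so that repeating a vote does not count twice.

```python
from typing import Dict, List, Set


def tally(votes: List[Dict], message_id: str, rejected: Set[str]) -> int:
    """Sum the latest +1/-1 vote of each identity that passes moderation."""
    latest: Dict[str, Dict] = {}
    for vote in votes:
        if vote["target"] != message_id or vote["voter"] in rejected:
            continue  # wrong message, or voter fails our moderation validation
        current = latest.get(vote["voter"])
        if current is None or vote["timestamp"] > current["timestamp"]:
            latest[vote["voter"]] = vote
    return sum(v["value"] for v in latest.values())


# Hypothetical vote frames: two community members and a small brigade.
votes = [
    {"voter": "alice", "target": "m1", "value": 1, "timestamp": 1},
    {"voter": "bob", "target": "m1", "value": 1, "timestamp": 1},
    {"voter": "troll1", "target": "m1", "value": -1, "timestamp": 2},
    {"voter": "troll1", "target": "m1", "value": -1, "timestamp": 3},
    {"voter": "troll2", "target": "m1", "value": -1, "timestamp": 2},
]

unfiltered = tally(votes, "m1", set())
filtered = tally(votes, "m1", {"troll1", "troll2"})
```

Two clients with different moderation subscriptions will arrive at different totals for the same message, which is the intended outcome here rather than a bug.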
A final mechanism of curation might include subscribing to personal curators to manually promote messages. This would work in a similar fashion to subscriptions or following in other contemporary social media, where you are notified whenever a person who you have subscribed to as a curator posts a message or promotes (think retweeting) someone else's message.
A combination of tagging, community action, and trusted subscription as organized by a client and configured by a user should solve the problem of curation in as much (or little) detail as any person could possibly want. Tags would form the basis of subjects, and thus (meta)communities, which themselves might be subdivided by moderation subscription.
Let's say you have posted a message and noticed a mistake after the fact. Let's say millions of people are now correcting you. It makes sense for a person to be able to retroactively clarify or reword something or correct misspellings. But it's dangerous to let people change history like that and incorrect to assume the edit will propagate to everyone in a super timely fashion. Can editing (or deleting) be allowed in a distributed peer to peer discussion platform and if so how?
The straightforward way would seem to be that you post a corrected message that supersedes the original. But a simple replacement would leave people open to situations where someone says one thing, someone else replies, and then the original person edits their message to say something completely different. X could say "puppies are awesome", Y could then in response say "I agree", and X could then edit their message to say "I hate puppies". Now it seems like Y hates puppies, when they in fact meant to express their love for puppies. The message that Y responded to was the original, and for this reason the original should always be maintained and in some way directly available. It should be up to the clients exactly how this is presented, but the fact that Y likes puppies and X changed their message after the fact is what people should be able to see.
This means that while you can put out a retraction or update to what you say, your original message once posted exists forever (or as long as some peers keep the thread). But you should be able to update it, and new people looking at your post should see the new version by default. People responding from that point onward should be responding to your edited version, and it should be clear which version they are responding to. So it seems the mechanism to edit a post should be to post another, corrected, message in response to your original message, with a piece of metadata identifying it as an edit. A similar mechanism might allow moderators to edit messages as well, though their edits would only be visible to people who have subscribed to them. Deleting could be accomplished by sending out a message frame with an edit tag and no message body, or perhaps an empty or null message body.
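Resolving which version to show by default could be sketched like this. The field names (id, author, text, edit_of, timestamp) are assumptions, the author field stands in for a verified public key, and moderator edits are left out to keep the example short.

```python
from typing import Dict, List, Optional


def current_version(original_id: str, messages: List[Dict]) -> Optional[Dict]:
    """Latest edit by the original author, or None if that edit is a deletion."""
    by_id = {m["id"]: m for m in messages}
    original = by_id[original_id]
    edits = [
        m for m in messages
        if m.get("edit_of") == original_id and m["author"] == original["author"]
    ]
    latest = max(edits, key=lambda m: m["timestamp"], default=original)
    return latest if latest.get("text") else None  # empty or null body = deleted


messages = [
    {"id": "m1", "author": "X", "text": "puppies are awesome", "timestamp": 1},
    {"id": "m2", "author": "Y", "text": "I agree", "refs": ["m1"], "timestamp": 2},
    {"id": "m3", "author": "X", "text": "I hate puppies", "edit_of": "m1", "timestamp": 3},
    {"id": "m4", "author": "Z", "text": "not my post", "edit_of": "m1", "timestamp": 4},
]

shown = current_version("m1", messages)
```

Y's reply still references m1 by its hash, so a client can always show that Y agreed with the original wording even though m3 is what new readers see first. The edit from Z is ignored because it does not come from the original author.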
References: