Mission

My purpose is to curate a living map of the World's knowledge.

My 生き甲斐 (ikigai), my “reason to get up in the morning”, is to facilitate responsible human achievement.

That is, to see how far humanity can get when its needs are met and caring, ethical, responsible people are empowered to act to the fullest of their human potential.

Intro

Curating a living map of the world's knowledge means contributing towards systems which make humanity more effective at pursuing answers to meaningful questions. It also requires that humanity have tools for obtaining a more holistic understanding of the questions we're asking (and of the implications surrounding these questions and answers). It is inspired by Vannevar Bush's Memex and Tim Berners-Lee's Giant Global Graph.

A big part of this mission involves structuring and connecting related data in relevant ways. It also entails designing Artificial Intelligence algorithms capable of classifying data with (or without) varying levels of human assistance. Above all, it means building views and interfaces (i.e. maps) which enable people both to easily contribute knowledge (e.g. Wikipedia) and explore fields of knowledge.

Here is why I chose these words:

  1. Curate - There are millions of people creating new works without taking responsibility to adequately consider, research, and connect what exists. Instead of a one-time investment by us, the authors, we often publish loosely and require that every reader repeat the same work of sifting through an increasingly unmanageable, endless stream of redundant information. What's worse is that this information is often unstructured and challenging to deduplicate. Having a lot of data, examples, and ideas is a great thing, so long as it doesn't overwhelm our ability to reason about them or lead to inconsistencies or irreconcilable differences. Consolidating, re-integrating, repurposing and polishing existing works is a great way to improve the value of a field's knowledge. (Recommended further reading: Research Debt, i.e. why learning is so hard and the low-hanging fruit we're missing to fix this.)
  2. Living - The universe is always changing. And as it does, there will always be variables we cannot measure, either because they are too far in the past, too distant in the present, or because they have yet to happen. Re: Gödel's incompleteness theorem and John McCarthy's Frame Problem, if a single equation of the world exists, we would only be able to heuristically approximate it with some high accuracy, never prove it. One is unable to prove a system from within it because a proof of verification would paradoxically require keeping track of all the state of the system, which by definition would require more bits than are in the system. Thus, we should always be adapting and revising our understanding.
  3. Map - It's important that my life goal entails a map, rather than a graph. A map implies a goal of discovery and a cartography. A way to ground, connect and compare things. Provenance trails whose steps can be retraced, citing important landmarks along the path to meaningful conclusions. A map can be a graph, but a map fundamentally serves a purpose of navigation; it makes the impossible possible. It gives us a novel view of a landscape which may otherwise be too complex for us to conceptualize.
  4. World - I believe in universal access to academic information as a human right.
  5. Knowledge - I believe that knowledge consists of things which are methodologically rooted in axioms, whether this be math, science, or history. While enjoyable as entertainment, I don't consider religion, music, or art to be knowledge (outside of the aforementioned contexts of history, math, science, etc). I believe the greatest sources of knowledge are academic papers, books, and personal web pages + essays -- especially curations which save us all hours of work or those which present a novel interface which affords the reader a "super-power", like being able to reproduce the writer's steps (which we may not have been smart enough to figure out ourselves).

Disclaimer: As with financial investments, not all of one's efforts are best spent directly on one's life mission. For instance, what's the point of creating a map of worldly knowledge if the earth is going to be hit by an asteroid during our lifetime? Or if we have tons of knowledge but no freedom of speech? Here's a list of some causes I support (both monetarily and through my volunteer efforts) in order to support my life mission. Included among these is Preventing Global and Existential Catastrophe.

Problem

One of humanity's most limiting factors is the speed at which we access the right information. Today there exists too great a volume of texts for anyone to manually process in a lifetime. This makes it commonplace to miss out on discovering a text which could change our life, our belief system, or our understanding. The problem is exacerbated by the unpredictable, variable quality of these works and by licensing which prevents curators from consolidating them.

For me, the process of choosing a book is so daunting it's anxiety-inducing. Reading takes a long time. There are only so many books we can read. And "meta-reading" (i.e. doing research just to figure out what you *should* be reading) is even more frustrating. It seems like with all the people in the world, we should have a much better idea about what the most important books to read are (and mechanisms to determine such information in a high quality way).

Example: Here's an example of the problem and the opportunity. There are approximately 29,681 books on Calculus on Amazon.com. How much of their content overlaps? How many of these books has an author read before they decide to write a new book on the topic? Which books contain the best or unique explanations? How can all these books be connected in a way which allows us to seamlessly switch from one to another, when one particular work isn't sufficient for our use case? For more thoughts on this topic, refer to, "On Books" and a deeper survey of the logistics of the problem.

Pictured below: Yasiv.com is an interface which attempts to simplify the problem of information overload on Amazon.com by connecting similar books which were purchased by the same people. Read also: From the Yasiv blog, Storytelling with Data - Graph Analysis.

I believe we as a society have need for cartographers of knowledge; that structured, linked data, paired with domain expertise (e.g. Wikipedia) and augmented with programmatic intelligence and access, can help humanity and machines work together to identify and map* [shortest] paths through knowledge spaces. Each of these directed paths represents a digital curriculum. And each of these curricula could be interconnected into a living, universal map of the world's knowledge. The inner contents of books, academic papers, and YouTube videos could be inserted (where appropriate) into this graph, enabling learners to seamlessly access the right resources (without the friction of having to check them out at a library) as they navigate each step of a curriculum. These will be the books of the 22nd century, and we have barely begun to scratch the surface of creating their printing press or "amazon.com".

* When I say map, I do not mean a "mapping" (in the mathematical sense of one thing mapping to another); I literally mean a navigational map (e.g. a Mercator map) of the kind an early explorer painstakingly labored over. It's important to be able to calculate distance between objects on such maps, to identify obstructions along routes (like traffic or mountains), and to qualify which areas are unexplored.
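To make the idea of distance on such a map concrete, here is a minimal sketch of finding a cheapest route between two topics. Everything in it is an assumption made for illustration: the topic names, the edge weights (imagined as rough "hours of effort", so a higher weight behaves like an obstruction), and the function name shortest_path. It uses Dijkstra's algorithm, which is one reasonable choice for non-negative weights and not a claim about how a real knowledge map would be built.

```python
import heapq

# Hypothetical toy map: each edge weight is an illustrative estimate of the
# effort (say, hours of study) needed to get from one topic to the next.
KNOWLEDGE_MAP = {
    "arithmetic": {"algebra": 2.0},
    "algebra": {"limits": 3.0, "trigonometry": 2.0},
    "trigonometry": {"limits": 4.0},
    "limits": {"derivatives": 2.0},
    "derivatives": {},
}


def shortest_path(graph, start, goal):
    """Return (total_cost, path) for the cheapest route, or None if unreachable."""
    queue = [(0.0, start, [start])]  # priority queue ordered by accumulated cost
    settled = set()                  # topics whose cheapest route is already known
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, weight in graph.get(node, {}).items():
            if neighbor not in settled:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    return None


print(shortest_path(KNOWLEDGE_MAP, "arithmetic", "derivatives"))
```

A return value of None marks a destination that cannot be reached from the starting point, which is one simple way to flag territory the map does not yet connect.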

A Universal Knowledge Map...

  1. Evolves: Is constantly evolving and perpetually incomplete.
  2. Aggregates: Serves as a hub or entity-resolved index, referencing existing efforts.
  3. Is Axiomatic: Resolves entities down to their simplest parts (or describes infinitely small parts via formulae).
  4. Has Cardinality: Has a well-defined coordinate system with next and previous operations.

What?

What does a solution look like?

Imagine having a digital atlas of all of Mathematics, with the ability to see a heatmap of all the topics you understand and the degree to which you understand them. Your personal GPS showing you where you are: your current state of knowledge, what you know, and what you don't know. Using your trackpad, you'd zoom semantically deeper into any subject area, just like Google Maps, learning more about the academic topology of each space: for instance, where combinatorics sits relative to number theory, and how large each is by comparison.

Now imagine zooming in through subject areas until encountering an unfamiliar equation or concept, and with the press of a key, your Map generates a direct, shortest path of self-contained directions, catered to you (based on information about your prior knowledge; e.g. Metacademy, Khan Academy and Arbital) on how to arrive at an understanding of the principle.
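As a rough sketch of what "directions catered to your prior knowledge" could mean in practice, the example below walks a small prerequisite graph and lists only the concepts a learner still needs, with dependencies first. The concept names, the PREREQUISITES data, and the learning_path function are all hypothetical. It also assumes that knowing a concept implies knowing its prerequisites, which a real system would want to verify rather than take for granted.

```python
# Hypothetical prerequisite data: each concept lists the concepts it depends on.
PREREQUISITES = {
    "derivatives": ["limits", "functions"],
    "limits": ["functions", "inequalities"],
    "functions": ["algebra"],
    "inequalities": ["algebra"],
    "algebra": [],
}


def learning_path(target, known, prerequisites):
    """List the concepts to study, prerequisites first, skipping known ones."""
    path = []
    visited = set(known)  # known concepts are treated as already visited

    def visit(concept):
        if concept in visited:
            return
        visited.add(concept)
        for dependency in prerequisites.get(concept, []):
            visit(dependency)
        path.append(concept)  # appended only after all of its dependencies

    visit(target)
    return path


# A learner who already knows algebra and functions gets a shorter route.
print(learning_path("derivatives", {"algebra", "functions"}, PREREQUISITES))
print(learning_path("derivatives", set(), PREREQUISITES))
```

The ordering is a depth-first topological sort, so it is only meaningful when the prerequisite graph has no cycles.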

Instead of getting map directions tailored to whether you are traveling by bike, walking, or public transit, you might choose to request directions for a brief walkthrough, a comprehensive example, or perhaps a basic overview is sufficient to address your use case. Now, instead of being limited to mathematics, imagine a Universal Knowledge Map (think of an atlas), which covers all scholarly knowledge and is constantly updated by the community and by algorithms, like Wikipedia.

Imagine if instead of holding a single book, you could use such a system to simultaneously navigate all books. At an instant you could discern which books had the best explanations, which sections others read, how and when readers jump from book to book, and in what sequence. It might look something like this. And what about the same thing for academic papers? What if instead of reading a paper, you could follow a self-contained, literate, expert curated curriculum which incorporates all dependencies you'd need to understand that work?

Such a platform would allow researchers of any skill level to holistically explore any field without wasting time guessing which information is accessible to them and in which order they should read it. With a living map of the world's knowledge, anyone may visualize and inspect their hypotheses against any and all available relevant information, to determine their applicability, their consistency and correctness, and their potential impact. And if this is achievable, why should we settle for anything less?

Why?

Out of every problem in the world, why is this one worth dedicating one's life to? I began with a list of first principles; criteria which I value:

  1. It's Forever*: A universally relevant, never-ending goal.
  2. It's Compounding*: The more it's used and advanced, the more valuable it becomes.
  3. It's Universal / Ubiquitous: Instead of serving a few people, it creates opportunity and value for everyone.
  4. It's Transparent: It forces us to recognize and address inconsistencies in our reasoning and logic.
  5. It's Impactful: Millions of hours are wasted on trying to organize information.
  6. It's Scalable: It can be scaled through technology and programmatic interfaces.
  7. It's Open & Causa Scientiae: The mission is noble and serves the public's interest.
  8. It's Imperative: It allows us to analyze questions whose answers are critical to ensuring the survival of the human race.

*I like the idea of working on problems which are universal (will remain relevant and compound in interest over the course of 10,000 years). In my mind, this broadly includes mathematics and computer science, physics, chemistry, and biology.

Why specifically? I have an essay on Facebook which describes the provenance of my coming to champion this mission. The simplest reason I can give is that Universal (programmatic) Access to Knowledge is just. It leads us towards a world I would like to see. It's frustrating knowing that all around the world people are duplicating the same inefficient preparatory steps, organizing their research process and environment, when none of it is material to the learning process. It's analogous to the frustration of searching for a book using a library card catalog versus typing in a URL and retrieving a book instantly. We can do better. Let's find a way to allow more humans to spend their time productively. My hypothesis is the result will be:

Why wouldn't we want the option to consider all available and relevant information when attempting to explain a phenomenon? It allows us to [leverage technology to] reason about things outside of our limited scope of knowledge. I asked Jessy Exum Dr. Richard Hamming's million dollar question, "What is the most important problem in the universe?" Jessy Exum answered, "I don't think that any single person can know the answer to that." I think this is one of the most compelling reasons that we should explore ways of sharing and uniting our knowledge, because the answer to this question is imperative to the survival of the human race.

Here are some key principles this mission addresses:

  1. Reduce redundancy: Focus on curating and organizing content to help researchers more easily discover what exists.
  2. Unsilo: Develop collaborative environments to help researchers amplify their efforts.
  3. Reproducibility: Help researchers accurately reproduce and interface with scientific results.
  4. Reduce Misinformation: Improve the accuracy of discourse and reduce inconsistency and fallacy through real-time fact-checking.
  5. Encourage Discovery: Improve the depth of discourse via real-time augmentation.
  6. Collaboration: To curate resources together and to share paths through knowledge.
  7. Provenance: Retain an epistemological history of knowledge discovery (and methods).
  8. Persistence: Eliminate single points of failure.

How?

How do we do it? What is entailed in curating a living map of the world's knowledge?

  1. Cultivating communities, protocols, and infrastructure to produce many central hubs and repositories of useful information, like Wikipedia, GitHub, StackExchange, Reddit, Quora, Facebook, et al.
  2. Exploring mechanisms and standards -- like IIIF (for images), XMPP (for chat), and OAI-PMH (for scholarly papers) -- to unite these disparate knowledge sources and to make their formats exchangeable, interoperable, and consumable by both humans and computers.
  3. Creating pipelines which use these standards to harvest data from these disparate sources, producing a single coherent, universal, distributed collection of knowledge.
  4. Recursively decomposing this well-formed data into more granular semantic units, such as RDF entity tags (as defined by Schema.org, and others). Well-defined, inter-compatible, common denominator units of information (like definitions are to a dictionary).
  5. Designing algorithms capable of recursively decomposing the complex documents and data representations which we've deferred, such as academic papers and books, into these structured epistemological modules.
  6. Connecting these common units of information in useful ways. Dissolving arbitrary document boundaries and leveraging communities and algorithms to thoughtfully model arbitrary sequences of knowledge modules and refine the relationships and dependencies between them over time (e.g. Freebase and Wikidata).
  7. Once we have these curated and normalized entity graphs, this means creating frameworks, protocols, and interfaces which enable the world to navigate paths through acquiring knowledge as seamlessly as Google Maps.
  8. Leveraging analytic feedback loops and tests to evolve better nomenclature, explanations, and sequences over time.
  9. And to ensure long-term universal, unrestricted access of this knowledge ecosystem for the greatest number of beneficiaries, this means obviating any one party's ability to control the flow of information by developing + adopting decentralized information infrastructure to lock it open.
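
As a small, hedged illustration of the harvesting and decomposition steps above, the sketch below parses an OAI-PMH ListRecords response carrying Dublin Core metadata and flattens each record into (subject, predicate, object) triples. The XML namespaces are the standard OAI-PMH and Dublin Core ones. The sample record, the identifier oai:example.org:1, and the function names are made up for the example, and a real harvester would fetch pages over HTTP and follow resumption tokens, which is omitted here.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}
DC_PREFIX = "{http://purl.org/dc/elements/1.1/}"

# Made-up sample response; a real one would come from a repository's
# OAI-PMH endpoint (verb=ListRecords, metadataPrefix=oai_dc).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Paper</dc:title>
          <dc:creator>A. Author</dc:creator>
          <dc:subject>Calculus</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def record_to_triples(record):
    """Flatten one OAI-PMH record into (identifier, predicate, value) triples."""
    identifier = record.findtext("oai:header/oai:identifier", namespaces=NAMESPACES)
    dc_root = record.find("oai:metadata/oai_dc:dc", NAMESPACES)
    if identifier is None or dc_root is None:
        return []  # skip records without an identifier or Dublin Core metadata
    triples = []
    for field in dc_root:
        if field.tag.startswith(DC_PREFIX) and field.text:
            predicate = "dc:" + field.tag[len(DC_PREFIX):]
            triples.append((identifier, predicate, field.text.strip()))
    return triples


def harvest(xml_text):
    """Parse a ListRecords response and return the triples for every record."""
    root = ET.fromstring(xml_text)
    triples = []
    for record in root.iterfind("oai:ListRecords/oai:record", NAMESPACES):
        triples.extend(record_to_triples(record))
    return triples


for triple in harvest(SAMPLE_RESPONSE):
    print(triple)
```

Triples of this shape are one possible "common denominator unit": records from different sources that resolve to the same identifier can be merged by simply pooling their triples.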

Applications

Here's a list of high impact sub-problems we may wish to explore as demos/examples:

Here are a few demos of systems which embody these philosophies well:

Requirements

This necessitates†:

Approach

Find 100 people with 100-year goals around evergreen topics (math, sciences, philosophy, medicine, history, human survival). Build prototypes which connect knowledge via graphs and unite their backends over the same database (graph.global).

Achievable?

Many citizens of the world (Vannevar Bush, Paul Otlet, Ted Nelson, Bertrand Russell and Alfred North Whitehead, Jimmy Wales, Euclid, Brewster Kahle, Larry Page & Sergey Brin, Salman Khan, and countless others) have dedicated vast amounts of their life energy to improving the accessibility of information. Some were so focused on their mission that they became tunnel-visioned (e.g. the Principia Mathematica, whose aim of a complete formalization ran up against Gödel's Incompleteness Theorem) and didn't spend sufficient time asking important questions as to the viability and consequences of their approaches.

Often when I tell people my goal, they classify me as a dreamer. And while I may be deserving of their cynicism, I'd like to think I make an active effort to dream with my eyes open. I readily concede my goals are more ambitious than I as an individual can expect to accomplish within my lifetime. But instead of sacrificing the scope of my goal, I instead aspire to make steady progress on manageable, solvable, and high impact sub-problems, to reach milestones of progress often, and for that which I can accomplish, do so well and with a clear trail of provenance that others may leverage after I am gone.

Towards this goal, I use spreadsheets to quantify my progress and actively surround myself with comrades who can give me constructive criticism and who can guide me towards curbing my enthusiasm. I observe a philosophy similar to the one my mentor Aaron Swartz describes in his essay, "Productivity" (see my additional thoughts). I try to stand on the shoulders of giants and learn from their successes, as well as their mistakes. Some of my favorite lessons I've preserved here.

Many of these lessons have made me sensitive to the challenges and logistic limitations which may prevent me from realizing some elements of this mission (e.g. Cory Doctorow's "Metacrap: Putting the torch to seven straw-men of the meta-utopia", John McCarthy's "Frame Problem", Kurt Gödel's "Incompleteness Theorem"). And while I try to be considerate and ground myself by these cautionary principles, I nonetheless subscribe to the philosophy that the Incompleteness Theorem doesn't mean "Stop Trying" -- it just means thoughtfully "Scope Your Objectives".

I think it's achievable to get to the point where human computation and (in the same vein) communication are the limiting factors to what we can learn, and not information retrieval and latency. That is, we will be able to instantly retrieve an answer to any question we can imagine, and the process by which we imagine and form questions will become our bottleneck. I believe this fundamental limitation of the human condition (which impacts how we think and communicate) will be addressed by extending our humanity with technology at a more intimate level (i.e. BCI and more "direct" access).

Is this achievable in my lifetime? Probably not, which is why part of my life needs to be dedicated to the sustainability of the world, discovering the right people who are capable of making forward progress on these issues, cultivating the interest of others and explaining why these problems are so important, and documenting my learnings and ideas as carefully as I can.

Connecting Dots

How do all of my projects come together?

Incoherent Notes

Principles & Philosophies

History

Perhaps you have a similar goal and would like to know what other work has been done towards its success. My path is informed by many great mathematicians, philosophers, scientists, and archivists who have been informing efforts for decades:

  1. Alfred North Whitehead and Bertrand Russell authored the Principia Mathematica, which was an important experiment and step in formalizing all of math. Except, their specific approach was doomed from conception. Similar to the roles of Marvin Minsky and Roger Schank in catalyzing the AI winter, Kurt Friedrich Gödel's Incompleteness Theorem (which, paraphrased, is similar to John McCarthy's Frame Problem

Version History

  1. 2016-01-29 - Added facebook post 10102408017839810
  2. 2015-12-26 - Initial Release

Thanks

Thanks to the following folks who helped improve this essay