Merge pull request #86 from lidel/feat/libp2p-key

Add support for libp2p-key multicodec to go-cid
Add libp2p-key multicodec
2019-05-27 16:46:10 -07:00 · 2019-05-28 01:40:54 +02:00 · 2019-05-13 10:54:14 -07:00 · 2019-05-10 09:41:01 -07:00 · 2019-05-06 16:22:17 -07:00 · 2019-02-28 18:32:58 +01:00
26 changed files with 1169 additions and 572 deletions
--- a/.gx/lastpubver
+++ b/.gx/lastpubver
@@ -1 +1 @@
-0.7.24: Qmdu2AYUV7yMoVBQPxXNfe7FJcdx16kYtsx6jAPKWQYF1y
+0.9.3: QmTbxNB1NwDesLmKTscr4udL2tVP7MaxvXnD1D9yX7g3PN
--- a/.travis.yml
+++ b/.travis.yml
@@ -1,24 +1,32 @@
-sudo: false
-
+os:
+  - linux

 language: go
+
 go:
-  - 'tip'
+  - 1.11.x

+env:
+  global:
+    - GOTFLAGS="-race"
+  matrix:
+    - BUILD_DEPTYPE=gx
+    - BUILD_DEPTYPE=gomod
+
+
+# disable travis install
 install:
-  - go get github.com/whyrusleeping/gx
-  - go get github.com/whyrusleeping/gx-go
-  - gx install --global
-script:
-  - gx test -v -race -coverprofile=coverage.txt -covermode=atomic .
+  - true
+
+script:
+  - bash <(curl -s https://raw.githubusercontent.com/ipfs/ci-helpers/master/travis-ci/run-standard-tests.sh)

-after_success:
-  - bash <(curl -s https://codecov.io/bash)

 cache:
  directories:
    - $GOPATH/src/gx
+    - $GOPATH/pkg/mod
+    - $HOME/.cache/go-build

 notifications:
-email: false
-  
+  email: false
--- a/README.md
+++ b/README.md
@@ -67,7 +67,7 @@ This will make sure that dependencies are rewritten to known working versions.

 ```go
 // Create a cid from a marshaled string
-c, err := cid.Decode("zdvgqEMYmNeH5fKciougvQcfzMcNjF3Z1tPouJ8C7pc3pe63k")
+c, err := cid.Decode("bafzbeigai3eoy2ccc7ybwjfz5r3rdxqrinwi4rwytly24tdbh6yk7zslrm")
 if err != nil {...}

 fmt.Println("Got CID: ", c)
--- a/_rsrch/cidiface/README.md
+++ b/_rsrch/cidiface/README.md
@@ -0,0 +1,168 @@
+What golang Kinds work best to implement CIDs?
+==============================================
+
+There are many possible ways to implement CIDs.  This package explores them.
+
+### criteria
+
+There are a few different criteria to consider:
+
+- We want the best performance when operating on the type (getters, mostly);
+- We want to minimize the number of memory allocations we need;
+- We want types which can be used as map keys, because this is common.
+
+The priority of these criteria is open to argument, but it's probably
+mapkeys > minalloc > anythingelse.
+(Mapkeys and minalloc are also quite entangled, since if we don't pick a
+representation that can work natively as a map key, we'll end up needing
+a `KeyRepr()` method which gives us something that does work as a map key,
+and that will almost certainly involve a malloc itself.)
+
+### options
+
+There are quite a few different ways to go:
+
+- Option A: CIDs as a struct; multihash as bytes.
+- Option B: CIDs as a string.
+- Option C: CIDs as an interface with multiple implementors.
+- Option D: CIDs as a struct; multihash also as a struct or string.
+- Option E: CIDs as a struct; content as strings plus offsets.
+- Option F: CIDs as a struct wrapping only a string.
+
+The current approach on the master branch is Option A.
+
+Option D is distinctive from Option A because multihash as bytes transitively
+causes the CID struct to be non-comparable and thus not suitable for map keys
+as per https://golang.org/ref/spec#KeyType .  (It's also a bit more work to
+pursue Option D because it's just a bigger splash radius of change; but also,
+something we might also want to do soon, because we *do* also have these same
+map-key-usability concerns with multihash alone.)
+
+Option E is distinctive from Option D because Option E would always maintain
+the binary format of the cid internally, and so could yield it again without
+malloc, while still potentially having faster access to components than
+Option B since it wouldn't need to re-parse varints to access later fields.
+
+Option F is actually a variation of Option B; it's distinctive from the other
+struct options because it is proposing *literally* `struct{ x string }` as
+the type, with no additional fields for components nor offsets.
+
+Option C is the avoid-choices choice, but note that interfaces are not free;
+since "minimize mallocs" is one of our major goals, we cannot use interfaces
+whimsically.
+
+Note there is no proposal for migrating to `type Cid []byte`, because that
+is generally considered to be strictly inferior to `type Cid string`.
+
+
+Discoveries
+-----------
+
+### using interfaces as map keys forgoes a lot of safety checks
+
+Using interfaces as map keys pushes a bunch of type checking to runtime.
+E.g., it's totally valid at compile time to push a type which is non-comparable
+into a map key; it will panic at *runtime* instead of failing at compile-time.
+
+There's also no way to define equality checks between implementors of the
+interface: golang will always use its innate concept of comparison for the
+concrete types.  This means it's effectively *never safe* to use two different
+concrete implementations of an interface in the same map; you may add elements
+which are semantically "equal" in your mind, and end up very confused later
+when both impls of the same "equal" object have been stored.
+
+### sentinel values are possible in any impl, but some are clearer than others
+
+When using `*Cid`, the nil value is a clear sentinel for 'invalid';
+when using `type Cid string`, the zero value is a clear sentinel;
+when using `type Cid struct` per Option A or D... the only valid check is
+for a nil multihash field, since version=0 and codec=0 are both valid values.
+When using `type Cid struct{string}` per Option F, zero is a clear sentinel.
+
+### usability as a map key is important
+
+We already covered this in the criteria section, but for clarity:
+
+- Option A: ❌
+- Option B: ✔
+- Option C: ~ (caveats, and depends on concrete impl)
+- Option D: ✔
+- Option E: ✔
+- Option F: ✔
+
+### living without offsets requires parsing
+
+Since CID (and multihash!) are defined using varints, they require parsing;
+we can't just jump into the string at a known offset in order to yield e.g.
+the multicodec number.
+
+In order to get to the 'meat' of the CID (the multihash content), we first
+must parse:
+
+- the CID version varint;
+- the multicodec varint;
+- the multihash type enum varint;
+- and the multihash length varint.
+
+Since there are many applications where we want to jump straight to the
+multihash content (for example, when doing CAS sharding -- see the
+[disclaimer](https://github.com/multiformats/multihash#disclaimers) about
+bias in leading bytes), this overhead may be interesting.
+
+Whether this overhead is significant is hard to say from microbenchmarking;
+it depends largely on usage patterns. If these traversals are a significant
+timesink, it would be an argument for Option D/E.
+If these traversals are *not* a significant timesink, we might be wiser
+to keep to Option B/F, because keeping a struct full of offsets will add several
+words of memory usage per CID, and we keep a *lot* of CIDs.
+
+### interfaces cause boxing which is a significant performance cost
+
+See `BenchmarkCidMap_CidStr` and friends.
+
+Long story short: using interfaces *anywhere* will cause the compiler to
+implicitly generate boxing and unboxing code (e.g. `runtime.convT2E`);
+this is both another function call, and more concerningly, results in
+large numbers of unbatchable memory allocations.
+
+Numbers without context are dangerous, but if you need one: 33%.
+It's a big deal.
+
+This means attempts to "use interfaces, but switch to concrete impls when
+performance is important" are a red herring: it doesn't work that way.
+
+This is not a general indictment against using interfaces -- but
+if a situation is at the scale where it's become important to mind whether
+or not pointers are a performance impact, then that situation also
+is one where you have to think twice before using interfaces.
+
+### struct wrappers can be used in place of typedefs with zero overhead
+
+See `TestSizeOf`.
+
+Using the `unsafe.Sizeof` feature to inspect what the Go runtime thinks,
+we can see that `type Foo string` and `type Foo struct{x string}` consume
+precisely the same amount of memory.
+
+This is interesting because it means we can choose between either
+type definition with no significant overhead anywhere we use it:
+thus, we can choose freely between Option B and Option F based on which
+we feel is more pleasant to work with.
+
+Option F (a struct wrapper) means we can prevent casting into our Cid type.
+Option B (typedef string) can be declared a `const`.
+Are there any other concerns that would separate the two choices?
+
+### one way or another: let's get rid of that star
+
+We should switch completely to handling `Cid` and remove `*Cid` completely.
+Regardless of whether we do this by migrating to interface, or string
+implementations, or simply structs with no pointers... once we get there,
+refactoring to any of the *others* can become a no-op from the perspective
+of any downstream code that uses CIDs.
+
+(This means all access via functions, never references to fields -- even if
+we were to use a struct implementation.  *Pretend* there's an interface,
+in other words.)
+
+There are probably `gofix` incantations which can help us with this migration.
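The README's warning about interface-typed map keys can be sketched in a few lines. This is a minimal illustration under stated assumptions: `ident`, `strCid`, and `wrapCid` are hypothetical stand-ins, not types from this package. It shows that two concrete implementations carrying the same bytes occupy separate map slots, because Go compares the dynamic type as well as the value.

```go
package main

import "fmt"

// ident is a hypothetical stand-in for the Cid interface.
type ident interface{ Raw() string }

// strCid and wrapCid are two hypothetical implementations that hold the
// same underlying bytes in different concrete types.
type strCid string
type wrapCid struct{ s string }

func (c strCid) Raw() string  { return string(c) }
func (c wrapCid) Raw() string { return c.s }

// countKeys inserts both representations of the same identifier and
// reports how many distinct keys the map ends up holding.
func countKeys(raw string) int {
	m := map[ident]int{}
	m[strCid(raw)] = 1
	m[wrapCid{raw}] = 2
	return len(m)
}

func main() {
	// The two keys are "equal" semantically, but the map stores both.
	fmt.Println(countKeys("\x01\x70")) // prints 2
}
```

Keeping a single concrete key type per map avoids the duplication entirely, which is one argument for Options B and F.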
--- a/_rsrch/cidiface/cid.go
+++ b/_rsrch/cidiface/cid.go
@@ -0,0 +1,48 @@
+package cid
+
+import (
+	mh "github.com/multiformats/go-multihash"
+)
+
+// Cid represents a self-describing content-addressed identifier.
+//
+// A CID is composed of:
+//
+//   - a Version of the CID itself,
+//   - a Multicodec (indicates the encoding of the referenced content),
+//   - and a Multihash (which identifies the referenced content).
+//
+// (Note that the Multihash further contains its own version and hash type
+// indicators.)
+type Cid interface {
+	// n.b. 'yields' means "without copy", 'produces' means a malloc.
+
+	Version() uint64         // Yields the version prefix as a uint.
+	Multicodec() uint64      // Yields the multicodec as a uint.
+	Multihash() mh.Multihash // Yields the multihash segment.
+
+	String() string // Produces the CID formatted as b58 string.
+	Bytes() []byte  // Produces the CID formatted as raw binary.
+
+	Prefix() Prefix // Produces a tuple of non-content metadata.
+
+	// some change notes:
+	// - `KeyString() CidString` is gone because we're natively a map key now, you're welcome.
+	// - `StringOfBase(mbase.Encoding) (string, error)` is skipped, maybe it can come back but maybe it should be a formatter's job.
+	// - `Equals(o Cid) bool` is gone because it's now `==`, you're welcome.
+
+	// TODO: make a multi-return method for {v,mc,mh} decomposition.  CidStr will be able to implement this more efficiently than if one makes a series of the individual getter calls.
+}
+
+// Prefix represents all the metadata of a Cid,
+// that is, the Version, the Codec, the Multihash type
+// and the Multihash length. It does not contain
+// any actual content information.
+// NOTE: The use of -1 in MhLength to mean default length is deprecated;
+//   use the V0Builder or V1Builder structures instead.
+type Prefix struct {
+	Version  uint64
+	Codec    uint64
+	MhType   uint64
+	MhLength int
+}
--- a/_rsrch/cidiface/cidBoxingBench_test.go
+++ b/_rsrch/cidiface/cidBoxingBench_test.go
@@ -0,0 +1,71 @@
+package cid
+
+import (
+	"testing"
+)
+
+// BenchmarkCidMap_CidStr estimates how fast it is to insert primitives into a map
+// keyed by CidStr (concretely).
+//
+// We do 100 insertions per benchmark run to make sure the map initialization
+// doesn't dominate the results.
+//
+// Sample results on linux amd64 go1.11beta:
+//
+//   BenchmarkCidMap_CidStr-8          100000             16317 ns/op
+//   BenchmarkCidMap_CidIface-8        100000             20516 ns/op
+//
+// With benchmem on:
+//
+//   BenchmarkCidMap_CidStr-8          100000             15579 ns/op           11223 B/op        207 allocs/op
+//   BenchmarkCidMap_CidIface-8        100000             19500 ns/op           12824 B/op        307 allocs/op
+//   BenchmarkCidMap_StrPlusHax-8      200000             10451 ns/op            7589 B/op        202 allocs/op
+//
+// We can see here that the impact of interface boxing is significant:
+// it increases the time taken to do the inserts to 133%, largely because
+// the implied `runtime.convT2E` calls cause another malloc each.
+//
+// There are also significant allocations in both cases because
+// A) we cannot create a multihash without allocations since they are []byte;
+// B) the map has to be grown several times;
+// C) something I haven't quite put my finger on yet.
+// Ideally we'd drive those down further as well.
+//
+// Pre-allocating the map reduces allocs by a very small percentage by *count*,
+// but reduces the time taken by 66% overall (presumably because when a map
+// re-arranges itself, it involves more or less an O(n) copy of the content
+// in addition to the alloc itself).  This isn't topical to the question of
+// whether or not interfaces are a good idea; just for contextualizing.
+//
+func BenchmarkCidMap_CidStr(b *testing.B) {
+	for i := 0; i < b.N; i++ {
+		mp := map[CidStr]int{}
+		for x := 0; x < 100; x++ {
+			mp[NewCidStr(0, uint64(x), []byte{})] = x
+		}
+	}
+}
+
+// BenchmarkCidMap_CidIface is in the family of BenchmarkCidMap_CidStr:
+// it is identical except the map key type is declared as an interface
+// (which forces all insertions to be boxed, changing performance).
+func BenchmarkCidMap_CidIface(b *testing.B) {
+	for i := 0; i < b.N; i++ {
+		mp := map[Cid]int{}
+		for x := 0; x < 100; x++ {
+			mp[NewCidStr(0, uint64(x), []byte{})] = x
+		}
+	}
+}
+
+// BenchmarkCidMap_CidStrAvoidMapGrowth is in the family of BenchmarkCidMap_CidStr:
+// it is identical except the map is created with a size hint that removes
+// some allocations (5, in practice, apparently).
+func BenchmarkCidMap_CidStrAvoidMapGrowth(b *testing.B) {
+	for i := 0; i < b.N; i++ {
+		mp := make(map[CidStr]int, 100)
+		for x := 0; x < 100; x++ {
+			mp[NewCidStr(0, uint64(x), []byte{})] = x
+		}
+	}
+}
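The allocation difference these benchmarks measure can also be observed outside `go test`. The following is a rough sketch, not the benchmark itself: `key` is a hypothetical string-backed type standing in for `CidStr`, and the counts come from `testing.AllocsPerRun`, so exact numbers will vary by Go version and platform.

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// key is a hypothetical string-backed identifier, standing in for CidStr.
type key string

// insertConcrete fills a pre-sized map keyed by the concrete type.
func insertConcrete(n int) int {
	m := make(map[key]int, n)
	for i := 0; i < n; i++ {
		m[key("cid-"+strconv.Itoa(i))] = i
	}
	return len(m)
}

// insertBoxed fills a pre-sized map keyed by an interface, so every key
// is converted to an interface value before insertion.
func insertBoxed(n int) int {
	m := make(map[interface{}]int, n)
	for i := 0; i < n; i++ {
		m[key("cid-"+strconv.Itoa(i))] = i
	}
	return len(m)
}

// allocs reports the average number of heap allocations per call of f.
func allocs(f func()) float64 {
	return testing.AllocsPerRun(100, f)
}

func main() {
	concrete := allocs(func() { insertConcrete(100) })
	boxed := allocs(func() { insertBoxed(100) })
	fmt.Printf("concrete: %.0f allocs, boxed: %.0f allocs\n", concrete, boxed)
}
```

Both maps are pre-sized so that map growth does not mask the per-key cost of boxing.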
--- a/_rsrch/cidiface/cidString.go
+++ b/_rsrch/cidiface/cidString.go
@@ -0,0 +1,161 @@
+package cid
+
+import (
+	"encoding/binary"
+	"fmt"
+
+	mbase "github.com/multiformats/go-multibase"
+	mh "github.com/multiformats/go-multihash"
+)
+
+//=================
+// def & accessors
+//=================
+
+var _ Cid = CidStr("")
+var _ map[CidStr]struct{} = nil
+
+// CidStr is a representation of a Cid as a string type containing binary.
+//
+// Using golang's string type is preferable over byte slices even for binary
+// data because golang strings are immutable, usable as map keys,
+// trivially comparable with built-in equals operators, etc.
+//
+// Please do not cast strings or bytes into the CidStr type directly;
+// use a parse method which validates the data and yields a CidStr.
+type CidStr string
+
+// EmptyCidStr is a constant for a zero/uninitialized/sentinel-value cid;
+// it is declared mainly for readability in checks for sentinel values.
+const EmptyCidStr = CidStr("")
+
+func (c CidStr) Version() uint64 {
+	bytes := []byte(c)
+	v, _ := binary.Uvarint(bytes)
+	return v
+}
+
+func (c CidStr) Multicodec() uint64 {
+	bytes := []byte(c)
+	_, n := binary.Uvarint(bytes) // skip version length
+	codec, _ := binary.Uvarint(bytes[n:])
+	return codec
+}
+
+func (c CidStr) Multihash() mh.Multihash {
+	bytes := []byte(c)
+	_, n1 := binary.Uvarint(bytes)      // skip version length
+	_, n2 := binary.Uvarint(bytes[n1:]) // skip codec length
+	return mh.Multihash(bytes[n1+n2:])  // return slice of remainder
+}
+
+// String returns the default string representation of a Cid.
+// Currently, Base58 is used as the encoding for the multibase string.
+func (c CidStr) String() string {
+	switch c.Version() {
+	case 0:
+		return c.Multihash().B58String()
+	case 1:
+		mbstr, err := mbase.Encode(mbase.Base58BTC, []byte(c))
+		if err != nil {
+			panic("should not error with hardcoded mbase: " + err.Error())
+		}
+		return mbstr
+	default:
+		panic("not possible to reach this point")
+	}
+}
+
+// Bytes produces a raw binary format of the CID.
+//
+// (For CidStr, this method is only distinct from casting because of
+// compatibility with v0 CIDs.)
+func (c CidStr) Bytes() []byte {
+	switch c.Version() {
+	case 0:
+		return c.Multihash()
+	case 1:
+		return []byte(c)
+	default:
+		panic("not possible to reach this point")
+	}
+}
+
+// Prefix builds and returns a Prefix out of a Cid.
+func (c CidStr) Prefix() Prefix {
+	dec, _ := mh.Decode(c.Multihash()) // assuming we got a valid multihash, this will not error
+	return Prefix{
+		MhType:   dec.Code,
+		MhLength: dec.Length,
+		Version:  c.Version(),
+		Codec:    c.Multicodec(),
+	}
+}
+
+//==================================
+// parsers & validators & factories
+//==================================
+
+func NewCidStr(version uint64, codecType uint64, mhash mh.Multihash) CidStr {
+	hashlen := len(mhash)
+	// two varints (at most binary.MaxVarintLen64 bytes each) plus hash
+	buf := make([]byte, 2*binary.MaxVarintLen64+hashlen)
+	n := binary.PutUvarint(buf, version)
+	n += binary.PutUvarint(buf[n:], codecType)
+	cn := copy(buf[n:], mhash)
+	if cn != hashlen {
+		panic("copy hash length is inconsistent")
+	}
+	return CidStr(buf[:n+hashlen])
+}
+
+// CidStrParse takes a binary byte slice, parses it, and returns either
+// a valid CidStr, or the zero CidStr and an error.
+//
+// For CidV1, the data buffer is in the form:
+//
+//     <version><codec-type><multihash>
+//
+// CidV0 is also supported. In particular, data buffers that are exactly
+// 34 bytes long and start with bytes [18, 32, ...] are treated as
+// binary multihashes.
+//
+// The multicodec bytes are not parsed to verify they're a valid varint;
+// no further reification is performed.
+//
+// Multibase encoding should already have been unwrapped before parsing;
+// if you have a multibase-enveloped string, use CidStrDecode instead.
+//
+// CidStrParse is the inverse of Cid.Bytes().
+func CidStrParse(data []byte) (CidStr, error) {
+	if len(data) == 34 && data[0] == 18 && data[1] == 32 {
+		h, err := mh.Cast(data)
+		if err != nil {
+			return EmptyCidStr, err
+		}
+		return NewCidStr(0, DagProtobuf, h), nil
+	}
+
+	vers, n := binary.Uvarint(data)
+	if err := uvError(n); err != nil {
+		return EmptyCidStr, err
+	}
+
+	if vers != 0 && vers != 1 {
+		return EmptyCidStr, fmt.Errorf("invalid cid version number: %d", vers)
+	}
+
+	_, cn := binary.Uvarint(data[n:])
+	if err := uvError(cn); err != nil {
+		return EmptyCidStr, err
+	}
+
+	rest := data[n+cn:]
+	h, err := mh.Cast(rest)
+	if err != nil {
+		return EmptyCidStr, err
+	}
+
+	// REVIEW: if the data is longer than the mh.len expects, we silently ignore it?  should we?
+	return CidStr(data[0 : n+cn+len(h)]), nil
+}
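The accessors in `cidString.go` all rely on the same idea: skip the leading varints to reach the field you want. Here is a standalone sketch of that walk using only the standard library. `splitCid` is a hypothetical helper, not part of this package, and the sample bytes are made up for illustration.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// splitCid walks the two leading varints of a binary CIDv1 and returns
// the version, the codec, and the remaining (unvalidated) multihash bytes.
func splitCid(data []byte) (uint64, uint64, []byte, error) {
	version, n := binary.Uvarint(data)
	if n <= 0 {
		return 0, 0, nil, errors.New("bad version varint")
	}
	codec, cn := binary.Uvarint(data[n:])
	if cn <= 0 {
		return 0, 0, nil, errors.New("bad codec varint")
	}
	return version, codec, data[n+cn:], nil
}

func main() {
	// Version 1, codec 0xb0 (encoded as the two-byte varint b0 01), then a
	// made-up short multihash: type 0x12, length 2, digest aa bb.
	raw := []byte{0x01, 0xb0, 0x01, 0x12, 0x02, 0xaa, 0xbb}
	v, c, mhash, err := splitCid(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("version=%d codec=0x%x multihash=%x\n", v, c, mhash)
	// prints: version=1 codec=0xb0 multihash=1202aabb
}
```

Because the codec may occupy more than one byte, the multihash offset cannot be hard-coded; it has to be computed from the varint lengths each time, which is the overhead the README discusses.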
--- a/_rsrch/cidiface/cidStruct.go
+++ b/_rsrch/cidiface/cidStruct.go
@@ -0,0 +1,164 @@
+package cid
+
+import (
+	"encoding/binary"
+	"fmt"
+
+	mbase "github.com/multiformats/go-multibase"
+	mh "github.com/multiformats/go-multihash"
+)
+
+//=================
+// def & accessors
+//=================
+
+var _ Cid = CidStruct{}
+
+//var _ map[CidStruct]struct{} = nil // Will not compile!  See struct def docs.
+//var _ map[Cid]struct{} = map[Cid]struct{}{CidStruct{}: struct{}{}} // Legal to compile...
+// but you'll get panics: "runtime error: hash of unhashable type cid.CidStruct"
+
+// CidStruct represents a CID in a struct format.
+//
+// This format complies with the exact same Cid interface as the CidStr
+// implementation, but completely pre-parses the Cid metadata.
+// CidStruct is a tad quicker in case of repeatedly accessed fields,
+// but requires more reshuffling to parse and to serialize.
+// CidStruct is not usable as a map key, because it contains a Multihash
+// reference, which is a slice, and thus not "comparable" as a primitive.
+//
+// Beware of zero-valued CidStruct: it is difficult to distinguish an
+// incorrectly-initialized "invalid" CidStruct from one representing a v0 cid.
+type CidStruct struct {
+	version uint64
+	codec   uint64
+	hash    mh.Multihash
+}
+
+// EmptyCidStruct is a constant for a zero/uninitialized/sentinelvalue cid;
+// it is declared mainly for readability in checks for sentinel values.
+//
+// Note: it's not actually a const; the compiler does not allow const structs.
+var EmptyCidStruct = CidStruct{}
+
+func (c CidStruct) Version() uint64 {
+	return c.version
+}
+
+func (c CidStruct) Multicodec() uint64 {
+	return c.codec
+}
+
+func (c CidStruct) Multihash() mh.Multihash {
+	return c.hash
+}
+
+// String returns the default string representation of a Cid.
+// Currently, Base58 is used as the encoding for the multibase string.
+func (c CidStruct) String() string {
+	switch c.Version() {
+	case 0:
+		return c.Multihash().B58String()
+	case 1:
+		mbstr, err := mbase.Encode(mbase.Base58BTC, c.Bytes())
+		if err != nil {
+			panic("should not error with hardcoded mbase: " + err.Error())
+		}
+		return mbstr
+	default:
+		panic("not possible to reach this point")
+	}
+}
+
+// Bytes produces a raw binary format of the CID.
+func (c CidStruct) Bytes() []byte {
+	switch c.version {
+	case 0:
+		return []byte(c.hash)
+	case 1:
+		// two varints (at most binary.MaxVarintLen64 bytes each) plus hash
+		buf := make([]byte, 2*binary.MaxVarintLen64+len(c.hash))
+		n := binary.PutUvarint(buf, c.version)
+		n += binary.PutUvarint(buf[n:], c.codec)
+		cn := copy(buf[n:], c.hash)
+		if cn != len(c.hash) {
+			panic("copy hash length is inconsistent")
+		}
+		return buf[:n+len(c.hash)]
+	default:
+		panic("not possible to reach this point")
+	}
+}
+
+// Prefix builds and returns a Prefix out of a Cid.
+func (c CidStruct) Prefix() Prefix {
+	dec, _ := mh.Decode(c.hash) // assuming we got a valid multihash, this will not error
+	return Prefix{
+		MhType:   dec.Code,
+		MhLength: dec.Length,
+		Version:  c.version,
+		Codec:    c.codec,
+	}
+}
+
+//==================================
+// parsers & validators & factories
+//==================================
+
+// CidStructParse takes a binary byte slice, parses it, and returns either
+// a valid CidStruct, or the zero CidStruct and an error.
+//
+// For CidV1, the data buffer is in the form:
+//
+//     <version><codec-type><multihash>
+//
+// CidV0 is also supported. In particular, data buffers that are exactly
+// 34 bytes long and start with bytes [18, 32, ...] are treated as
+// binary multihashes.
+//
+// The multicodec bytes are not parsed to verify they're a valid varint;
+// no further reification is performed.
+//
+// Multibase encoding should already have been unwrapped before parsing;
+// if you have a multibase-enveloped string, use CidStructDecode instead.
+//
+// CidStructParse is the inverse of Cid.Bytes().
+func CidStructParse(data []byte) (CidStruct, error) {
+	if len(data) == 34 && data[0] == 18 && data[1] == 32 {
+		h, err := mh.Cast(data)
+		if err != nil {
+			return EmptyCidStruct, err
+		}
+		return CidStruct{
+			codec:   DagProtobuf,
+			version: 0,
+			hash:    h,
+		}, nil
+	}
+
+	vers, n := binary.Uvarint(data)
+	if err := uvError(n); err != nil {
+		return EmptyCidStruct, err
+	}
+
+	if vers != 0 && vers != 1 {
+		return EmptyCidStruct, fmt.Errorf("invalid cid version number: %d", vers)
+	}
+
+	codec, cn := binary.Uvarint(data[n:])
+	if err := uvError(cn); err != nil {
+		return EmptyCidStruct, err
+	}
+
+	rest := data[n+cn:]
+	h, err := mh.Cast(rest)
+	if err != nil {
+		return EmptyCidStruct, err
+	}
+
+	return CidStruct{
+		version: vers,
+		codec:   codec,
+		hash:    h,
+	}, nil
+}
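The commented-out declarations near the top of `cidStruct.go` describe a runtime panic when a slice-bearing struct is used as an interface-typed map key. That behavior can be reproduced in isolation. In this sketch `sliceCid` and `tryInsert` are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// sliceCid is a hypothetical struct shaped like CidStruct: the []byte
// field makes the whole struct non-comparable.
type sliceCid struct {
	version uint64
	hash    []byte
}

// strCid is a hypothetical string-backed alternative, which is comparable.
type strCid string

// tryInsert uses v as an interface-typed map key and reports whether the
// runtime rejected it with a panic.
func tryInsert(v interface{}) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	m := map[interface{}]struct{}{}
	m[v] = struct{}{}
	return false
}

func main() {
	fmt.Println(tryInsert(sliceCid{})) // prints true: unhashable type
	fmt.Println(tryInsert(strCid("x"))) // prints false: strings hash fine
}
```

Declaring `map[sliceCid]struct{}` directly would be rejected by the compiler; only the interface-keyed form defers the failure to runtime.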
--- a/_rsrch/cidiface/enums.go
+++ b/_rsrch/cidiface/enums.go
@@ -0,0 +1,79 @@
+package cid
+
+// These are multicodec-packed content types. They should match
+// the codes described in the authoritative document:
+// https://github.com/multiformats/multicodec/blob/master/table.csv
+const (
+	Raw = 0x55
+
+	DagProtobuf = 0x70
+	DagCBOR     = 0x71
+	Libp2pKey   = 0x72
+
+	GitRaw = 0x78
+
+	EthBlock           = 0x90
+	EthBlockList       = 0x91
+	EthTxTrie          = 0x92
+	EthTx              = 0x93
+	EthTxReceiptTrie   = 0x94
+	EthTxReceipt       = 0x95
+	EthStateTrie       = 0x96
+	EthAccountSnapshot = 0x97
+	EthStorageTrie     = 0x98
+	BitcoinBlock       = 0xb0
+	BitcoinTx          = 0xb1
+	ZcashBlock         = 0xc0
+	ZcashTx            = 0xc1
+	DecredBlock        = 0xe0
+	DecredTx           = 0xe1
+)
+
+// Codecs maps the name of a codec to its type
+var Codecs = map[string]uint64{
+	"v0":                   DagProtobuf,
+	"raw":                  Raw,
+	"protobuf":             DagProtobuf,
+	"cbor":                 DagCBOR,
+	"libp2p-key":           Libp2pKey,
+	"git-raw":              GitRaw,
+	"eth-block":            EthBlock,
+	"eth-block-list":       EthBlockList,
+	"eth-tx-trie":          EthTxTrie,
+	"eth-tx":               EthTx,
+	"eth-tx-receipt-trie":  EthTxReceiptTrie,
+	"eth-tx-receipt":       EthTxReceipt,
+	"eth-state-trie":       EthStateTrie,
+	"eth-account-snapshot": EthAccountSnapshot,
+	"eth-storage-trie":     EthStorageTrie,
+	"bitcoin-block":        BitcoinBlock,
+	"bitcoin-tx":           BitcoinTx,
+	"zcash-block":          ZcashBlock,
+	"zcash-tx":             ZcashTx,
+	"decred-block":         DecredBlock,
+	"decred-tx":            DecredTx,
+}
+
+// CodecToStr maps the numeric codec to its name
+var CodecToStr = map[uint64]string{
+	Raw:                "raw",
+	DagProtobuf:        "protobuf",
+	DagCBOR:            "cbor",
+	Libp2pKey:          "libp2p-key",
+	GitRaw:             "git-raw",
+	EthBlock:           "eth-block",
+	EthBlockList:       "eth-block-list",
+	EthTxTrie:          "eth-tx-trie",
+	EthTx:              "eth-tx",
+	EthTxReceiptTrie:   "eth-tx-receipt-trie",
+	EthTxReceipt:       "eth-tx-receipt",
+	EthStateTrie:       "eth-state-trie",
+	EthAccountSnapshot: "eth-account-snapshot",
+	EthStorageTrie:     "eth-storage-trie",
+	BitcoinBlock:       "bitcoin-block",
+	BitcoinTx:          "bitcoin-tx",
+	ZcashBlock:         "zcash-block",
+	ZcashTx:            "zcash-tx",
+	DecredBlock:        "decred-block",
+	DecredTx:           "decred-tx",
+}
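One detail of the two tables above is that they are not exact inverses: `"v0"` and `"protobuf"` both map to `DagProtobuf`, while the reverse table yields only `"protobuf"`. The sketch below shows this with a trimmed copy of the tables. The lowercase names and the `canonicalName` helper are hypothetical, introduced only for this example.

```go
package main

import "fmt"

// A trimmed copy of the codec constants, for illustration only.
const (
	dagProtobuf = 0x70
	libp2pKey   = 0x72
)

// codecs maps names (including the "v0" alias) to codes.
var codecs = map[string]uint64{
	"v0":         dagProtobuf,
	"protobuf":   dagProtobuf,
	"libp2p-key": libp2pKey,
}

// codecToStr maps each code back to a single canonical name.
var codecToStr = map[uint64]string{
	dagProtobuf: "protobuf",
	libp2pKey:   "libp2p-key",
}

// canonicalName resolves a name to its code and back, returning the
// canonical spelling and whether the name was known.
func canonicalName(name string) (string, bool) {
	code, ok := codecs[name]
	if !ok {
		return "", false
	}
	canon, ok := codecToStr[code]
	return canon, ok
}

func main() {
	fmt.Println(canonicalName("v0"))         // prints: protobuf true
	fmt.Println(canonicalName("libp2p-key")) // prints: libp2p-key true
	fmt.Println(canonicalName("nope"))       // prints an empty name and false
}
```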
--- a/_rsrch/cidiface/errors.go
+++ b/_rsrch/cidiface/errors.go
@@ -0,0 +1,24 @@
+package cid
+
+import (
+	"errors"
+)
+
+var (
+	// ErrVarintBuffSmall means that a buffer passed to the cid parser was not
+	// long enough, or did not contain a valid cid
+	ErrVarintBuffSmall = errors.New("reading varint: buffer too small")
+
+	// ErrVarintTooBig means that the varint in the given cid was above the
+	// limit of 2^64
+	ErrVarintTooBig = errors.New("reading varint: varint bigger than 64bits" +
+		" and not supported")
+
+	// ErrCidTooShort means that the cid passed to decode was not long
+	// enough to be a valid Cid
+	ErrCidTooShort = errors.New("cid too short")
+
+	// ErrInvalidEncoding means that selected encoding is not supported
+	// by this Cid version
+	ErrInvalidEncoding = errors.New("invalid base encoding")
+)
--- a/_rsrch/cidiface/misc.go
+++ b/_rsrch/cidiface/misc.go
@@ -0,0 +1,12 @@
+package cid
+
+func uvError(read int) error {
+	switch {
+	case read == 0:
+		return ErrVarintBuffSmall
+	case read < 0:
+		return ErrVarintTooBig
+	default:
+		return nil
+	}
+}
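`uvError` interprets the byte count returned by `binary.Uvarint`: zero means the buffer ended before the varint did, and a negative value means the varint overflows 64 bits. A small sketch of all three outcomes follows; `classify` is a hypothetical helper that returns labels instead of the package's error values.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// classify mirrors the uvError switch, returning a label instead of an error.
func classify(buf []byte) string {
	_, n := binary.Uvarint(buf)
	switch {
	case n == 0:
		return "buffer too small"
	case n < 0:
		return "varint too big"
	default:
		return "ok"
	}
}

func main() {
	// Nine continuation bytes followed by a final byte greater than 1
	// cannot fit in 64 bits.
	overflow := []byte{0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x02}

	fmt.Println(classify([]byte{0x70})) // prints: ok
	fmt.Println(classify([]byte{0x80})) // prints: buffer too small
	fmt.Println(classify(overflow))     // prints: varint too big
}
```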
--- a/builder.go
+++ b/builder.go
@@ -5,7 +5,7 @@ import (
 )

 type Builder interface {
-	Sum(data []byte) (*Cid, error)
+	Sum(data []byte) (Cid, error)
 	GetCodec() uint64
 	WithCodec(uint64) Builder
 }
@@ -33,10 +33,10 @@ func (p Prefix) WithCodec(c uint64) Builder {
 	return p
 }

-func (p V0Builder) Sum(data []byte) (*Cid, error) {
+func (p V0Builder) Sum(data []byte) (Cid, error) {
 	hash, err := mh.Sum(data, mh.SHA2_256, -1)
 	if err != nil {
-		return nil, err
+		return Undef, err
 	}
 	return NewCidV0(hash), nil
 }
@@ -52,14 +52,14 @@ func (p V0Builder) WithCodec(c uint64) Builder {
 	return V1Builder{Codec: c, MhType: mh.SHA2_256}
 }

-func (p V1Builder) Sum(data []byte) (*Cid, error) {
+func (p V1Builder) Sum(data []byte) (Cid, error) {
 	mhLen := p.MhLength
 	if mhLen <= 0 {
 		mhLen = -1
 	}
 	hash, err := mh.Sum(data, p.MhType, mhLen)
 	if err != nil {
-		return nil, err
+		return Undef, err
 	}
 	return NewCidV1(p.Codec, hash), nil
 }
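The change from `(*Cid, error)` to `(Cid, error)` means failures now return a sentinel value rather than `nil`. The pattern can be sketched as follows, assuming a string-wrapping value type. Here `ident`, `undef`, and `sum` are hypothetical stand-ins, and the empty-input check exists only to give the sketch a failure path.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

// ident is a hypothetical value type wrapping a string, in the spirit of
// Option F; it is comparable, so == works against the sentinel.
type ident struct{ str string }

// undef is the zero value, used as the "no identifier" sentinel.
var undef = ident{}

// sum hashes data and returns an identifier by value. On failure it
// returns the sentinel instead of a nil pointer.
func sum(data []byte) (ident, error) {
	if len(data) == 0 {
		// Hypothetical rule, present only to demonstrate the error path.
		return undef, errors.New("refusing to hash empty input")
	}
	digest := sha256.Sum256(data)
	// 0x12 is the sha2-256 multihash code; 0x20 is the 32-byte digest length.
	buf := append([]byte{0x12, 0x20}, digest[:]...)
	return ident{string(buf)}, nil
}

func main() {
	c, err := sum(nil)
	fmt.Println(c == undef, err != nil) // prints: true true

	c, err = sum([]byte("hello"))
	fmt.Println(c == undef, err, len(c.str)) // prints: false <nil> 34
}
```

Callers compare against the sentinel with `==` instead of checking for `nil`, and the returned value can be used directly as a map key.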
--- a/cid-fmt/main.go
+++ b/cid-fmt/main.go
@@ -1,128 +0,0 @@
-package main
-
-import (
-	"fmt"
-	"os"
-	"strings"
-
-	c "github.com/ipfs/go-cid"
-
-	mb "github.com/multiformats/go-multibase"
-)
-
-func usage() {
-	fmt.Fprintf(os.Stderr, "usage: %s [-b multibase-code] [-v cid-version] <fmt-str> <cid> ...\n\n", os.Args[0])
-	fmt.Fprintf(os.Stderr, "<fmt-str> is either 'prefix' or a printf style format string:\n%s", c.FormatRef)
-	os.Exit(2)
-}
-
-func main() {
-	if len(os.Args) < 2 {
-		usage()
-	}
-	newBase := mb.Encoding(-1)
-	var verConv func(cid *c.Cid) (*c.Cid, error)
-	args := os.Args[1:]
-outer:
-	for {
-		switch args[0] {
-		case "-b":
-			if len(args) < 2 {
-				usage()
-			}
-			encoder, err := mb.EncoderByName(args[1])
-			if err != nil {
-				fmt.Fprintf(os.Stderr, "Error: %s\n", err.Error())
-				os.Exit(2)
-			}
-			newBase = encoder.Encoding()
-			args = args[2:]
-		case "-v":
-			if len(args) < 2 {
-				usage()
-			}
-			switch args[1] {
-			case "0":
-				verConv = toCidV0
-			case "1":
-				verConv = toCidV1
-			default:
-				fmt.Fprintf(os.Stderr, "Error: Invalid cid version: %s\n", args[1])
-				os.Exit(2)
-			}
-			args = args[2:]
-		default:
-			break outer
-		}
-	}
-	if len(args) < 2 {
-		usage()
-	}
-	fmtStr := args[0]
-	switch fmtStr {
-	case "prefix":
-		fmtStr = "%P"
-	default:
-		if strings.IndexByte(fmtStr, '%') == -1 {
-			fmt.Fprintf(os.Stderr, "Error: Invalid format string: %s\n", fmtStr)
-			os.Exit(2)
-		}
-	}
-	for _, cidStr := range args[1:] {
-		cid, err := c.Decode(cidStr)
-		if err != nil {
-			fmt.Fprintf(os.Stdout, "!INVALID_CID!\n")
-			errorMsg("%s: %v", cidStr, err)
-			// Don't abort on a bad cid
-			continue
-		}
-		base := newBase
-		if newBase == -1 {
-			base, _ = c.ExtractEncoding(cidStr)
-		}
-		if verConv != nil {
-			cid, err = verConv(cid)
-			if err != nil {
-				fmt.Fprintf(os.Stdout, "!ERROR!\n")
-				errorMsg("%s: %v", cidStr, err)
-				// Don't abort on a bad conversion
-				continue
-			}
-		}
-		str, err := c.Format(fmtStr, base, cid)
-		switch err.(type) {
-		case c.FormatStringError:
-			fmt.Fprintf(os.Stderr, "Error: %v\n", err)
-			os.Exit(2)
-		default:
-			fmt.Fprintf(os.Stdout, "!ERROR!\n")
-			errorMsg("%s: %v", cidStr, err)
-			// Don't abort on cid specific errors
-			continue
-		case nil:
-			// no error
-		}
-		fmt.Fprintf(os.Stdout, "%s\n", str)
-	}
-	os.Exit(exitCode)
-}
-
-var exitCode = 0
-
-func errorMsg(fmtStr string, a ...interface{}) {
-	fmt.Fprintf(os.Stderr, "Error: ")
-	fmt.Fprintf(os.Stderr, fmtStr, a...)
-	fmt.Fprintf(os.Stderr, "\n")
-	exitCode = 1
-}
-
-func toCidV0(cid *c.Cid) (*c.Cid, error) {
-	if cid.Type() != c.DagProtobuf {
-		return nil, fmt.Errorf("can't convert non-protobuf nodes to cidv0")
-	}
-	return c.NewCidV0(cid.Hash()), nil
-}
-
-func toCidV1(cid *c.Cid) (*c.Cid, error) {
-	return c.NewCidV1(cid.Type(), cid.Hash()), nil
-}
--- a/cid-fmt/main_test.go
+++ b/cid-fmt/main_test.go
@@ -1,45 +0,0 @@
-package main
-
-import (
-	"fmt"
-	"testing"
-
-	c "github.com/ipfs/go-cid"
-)
-
-func TestCidConv(t *testing.T) {
-	cidv0 := "QmUNLLsPACCz1vLxQVkXqqLX5R1X345qqfHbsf67hvA3Nn"
-	cidv1 := "zdj7WbTaiJT1fgatdet9Ei9iDB5hdCxkbVyhyh8YTUnXMiwYi"
-	cid, err := c.Decode(cidv0)
-	if err != nil {
-		t.Fatal(err)
-	}
-	cid, err = toCidV1(cid)
-	if err != nil {
-		t.Fatal(err)
-	}
-	if cid.String() != cidv1 {
-		t.Fatal("conversion failure")
-	}
-	cid, err = toCidV0(cid)
-	if err != nil {
-		t.Fatal(err)
-	}
-	cidStr := cid.String()
-	if cidStr != cidv0 {
-		t.Error(fmt.Sprintf("conversion failure, expected: %s; but got: %s", cidv0, cidStr))
-	}
-}
-
-func TestBadCidConv(t *testing.T) {
-	// this cid is a raw leaf and should not be able to convert to cidv0
-	cidv1 := "zb2rhhzX7uSKrtQ2ZZXFAabKiKFYZrJqKY2KE1cJ8yre2GSWZ"
-	cid, err := c.Decode(cidv1)
-	if err != nil {
-		t.Fatal(err)
-	}
-	cid, err = toCidV0(cid)
-	if err == nil {
-		t.Fatal("expected failure")
-	}
-}
--- a/cid.go
+++ b/cid.go
@@ -21,6 +21,7 @@ package cid

 import (
 	"bytes"
+	"encoding"
 	"encoding/binary"
 	"encoding/json"
 	"errors"
@@ -61,6 +62,7 @@ const (

 	DagProtobuf = 0x70
 	DagCBOR     = 0x71
+	Libp2pKey   = 0x72

 	GitRaw = 0x78

@@ -79,6 +81,8 @@ const (
 	ZcashTx            = 0xc1
 	DecredBlock        = 0xe0
 	DecredTx           = 0xe1
+	DashBlock          = 0xf0
+	DashTx             = 0xf1
 )

 // Codecs maps the name of a codec to its type
@@ -87,6 +91,7 @@ var Codecs = map[string]uint64{
 	"raw":                  Raw,
 	"protobuf":             DagProtobuf,
 	"cbor":                 DagCBOR,
+	"libp2p-key":           Libp2pKey,
 	"git-raw":              GitRaw,
 	"eth-block":            EthBlock,
 	"eth-block-list":       EthBlockList,
@@ -103,6 +108,8 @@ var Codecs = map[string]uint64{
 	"zcash-tx":             ZcashTx,
 	"decred-block":         DecredBlock,
 	"decred-tx":            DecredTx,
+	"dash-block":           DashBlock,
+	"dash-tx":              DashTx,
 }

 // CodecToStr maps the numeric codec to its name
@@ -126,42 +133,67 @@ var CodecToStr = map[uint64]string{
 	ZcashTx:            "zcash-tx",
 	DecredBlock:        "decred-block",
 	DecredTx:           "decred-tx",
+	DashBlock:          "dash-block",
+	DashTx:             "dash-tx",
 }

 // NewCidV0 returns a Cid-wrapped multihash.
 // They exist to allow IPFS to work with Cids while keeping
 // compatibility with the plain-multihash format used used in IPFS.
 // NewCidV1 should be used preferentially.
-func NewCidV0(mhash mh.Multihash) *Cid {
-	return &Cid{
-		version: 0,
-		codec:   DagProtobuf,
-		hash:    mhash,
+func NewCidV0(mhash mh.Multihash) Cid {
+	// Need to make sure hash is valid for CidV0 otherwise we will
+	// incorrectly detect it as CidV1 in the Version() method
+	dec, err := mh.Decode(mhash)
+	if err != nil {
+		panic(err)
 	}
+	if dec.Code != mh.SHA2_256 || dec.Length != 32 {
+		panic("invalid hash for cidv0")
+	}
+	return Cid{string(mhash)}
 }

 // NewCidV1 returns a new Cid using the given multicodec-packed
 // content type.
-func NewCidV1(codecType uint64, mhash mh.Multihash) *Cid {
-	return &Cid{
-		version: 1,
-		codec:   codecType,
-		hash:    mhash,
+func NewCidV1(codecType uint64, mhash mh.Multihash) Cid {
+	hashlen := len(mhash)
+	// two varints (at most binary.MaxVarintLen64 bytes each) plus hash
+	buf := make([]byte, 2*binary.MaxVarintLen64+hashlen)
+	n := binary.PutUvarint(buf, 1)
+	n += binary.PutUvarint(buf[n:], codecType)
+	cn := copy(buf[n:], mhash)
+	if cn != hashlen {
+		panic("copy hash length is inconsistent")
 	}
+
+	return Cid{string(buf[:n+hashlen])}
 }

-// Cid represents a self-describing content adressed
+var _ encoding.BinaryMarshaler = Cid{}
+var _ encoding.BinaryUnmarshaler = (*Cid)(nil)
+var _ encoding.TextMarshaler = Cid{}
+var _ encoding.TextUnmarshaler = (*Cid)(nil)
+
+// Cid represents a self-describing content addressed
 // identifier. It is formed by a Version, a Codec (which indicates
 // a multicodec-packed content type) and a Multihash.
-type Cid struct {
-	version uint64
-	codec   uint64
-	hash    mh.Multihash
+type Cid struct{ str string }
+
+// Undef can be used to represent a nil or undefined Cid. Using Cid{}
+// directly is also acceptable.
+var Undef = Cid{}
+
+// Defined returns true if a Cid is defined.
+// Calling any other methods on an undefined Cid will result in
+// undefined behavior.
+func (c Cid) Defined() bool {
+	return c.str != ""
 }

 // Parse is a short-hand function to perform Decode, Cast etc... on
 // a generic interface{} type.
-func Parse(v interface{}) (*Cid, error) {
+func Parse(v interface{}) (Cid, error) {
 	switch v2 := v.(type) {
 	case string:
 		if strings.Contains(v2, "/ipfs/") {
@@ -172,10 +204,10 @@ func Parse(v interface{}) (*Cid, error) {
 		return Cast(v2)
 	case mh.Multihash:
 		return NewCidV0(v2), nil
-	case *Cid:
+	case Cid:
 		return v2, nil
 	default:
-		return nil, fmt.Errorf("can't parse %+v as Cid", v2)
+		return Undef, fmt.Errorf("can't parse %+v as Cid", v2)
 	}
 }

@@ -191,15 +223,15 @@ func Parse(v interface{}) (*Cid, error) {
 // Decode will also detect and parse CidV0 strings. Strings
 // starting with "Qm" are considered CidV0 and treated directly
 // as B58-encoded multihashes.
-func Decode(v string) (*Cid, error) {
+func Decode(v string) (Cid, error) {
 	if len(v) < 2 {
-		return nil, ErrCidTooShort
+		return Undef, ErrCidTooShort
 	}

 	if len(v) == 46 && v[:2] == "Qm" {
 		hash, err := mh.FromB58String(v)
 		if err != nil {
-			return nil, err
+			return Undef, err
 		}

 		return NewCidV0(hash), nil
@@ -207,7 +239,7 @@ func Decode(v string) (*Cid, error) {

 	_, data, err := mbase.Decode(v)
 	if err != nil {
-		return nil, err
+		return Undef, err
 	}

 	return Cast(data)
@@ -257,61 +289,88 @@ func uvError(read int) error {
 //
 // Please use decode when parsing a regular Cid string, as Cast does not
 // expect multibase-encoded data. Cast accepts the output of Cid.Bytes().
-func Cast(data []byte) (*Cid, error) {
+func Cast(data []byte) (Cid, error) {
 	if len(data) == 34 && data[0] == 18 && data[1] == 32 {
 		h, err := mh.Cast(data)
 		if err != nil {
-			return nil, err
+			return Undef, err
 		}

-		return &Cid{
-			codec:   DagProtobuf,
-			version: 0,
-			hash:    h,
-		}, nil
+		return NewCidV0(h), nil
 	}

 	vers, n := binary.Uvarint(data)
 	if err := uvError(n); err != nil {
-		return nil, err
+		return Undef, err
 	}

-	if vers != 0 && vers != 1 {
-		return nil, fmt.Errorf("invalid cid version number: %d", vers)
+	if vers != 1 {
+		return Undef, fmt.Errorf("expected 1 as the cid version number, got: %d", vers)
 	}

-	codec, cn := binary.Uvarint(data[n:])
+	_, cn := binary.Uvarint(data[n:])
 	if err := uvError(cn); err != nil {
-		return nil, err
+		return Undef, err
 	}

 	rest := data[n+cn:]
 	h, err := mh.Cast(rest)
 	if err != nil {
-		return nil, err
+		return Undef, err
 	}

-	return &Cid{
-		version: vers,
-		codec:   codec,
-		hash:    h,
-	}, nil
+	return Cid{string(data[0 : n+cn+len(h)])}, nil
+}
+
+// UnmarshalBinary is equivalent to Cast(). It implements the
+// encoding.BinaryUnmarshaler interface.
+func (c *Cid) UnmarshalBinary(data []byte) error {
+	casted, err := Cast(data)
+	if err != nil {
+		return err
+	}
+	c.str = casted.str
+	return nil
+}
+
+// UnmarshalText is equivalent to Decode(). It implements the
+// encoding.TextUnmarshaler interface.
+func (c *Cid) UnmarshalText(text []byte) error {
+	decodedCid, err := Decode(string(text))
+	if err != nil {
+		return err
+	}
+	c.str = decodedCid.str
+	return nil
+}
+
+// Version returns the Cid version.
+func (c Cid) Version() uint64 {
+	if len(c.str) == 34 && c.str[0] == 18 && c.str[1] == 32 {
+		return 0
+	}
+	return 1
 }

 // Type returns the multicodec-packed content type of a Cid.
-func (c *Cid) Type() uint64 {
-	return c.codec
+func (c Cid) Type() uint64 {
+	if c.Version() == 0 {
+		return DagProtobuf
+	}
+	_, n := uvarint(c.str)
+	codec, _ := uvarint(c.str[n:])
+	return codec
 }

 // String returns the default string representation of a
 // Cid. Currently, Base58 is used as the encoding for the
 // multibase string.
-func (c *Cid) String() string {
-	switch c.version {
+func (c Cid) String() string {
+	switch c.Version() {
 	case 0:
-		return c.hash.B58String()
+		return c.Hash().B58String()
 	case 1:
-		mbstr, err := mbase.Encode(mbase.Base58BTC, c.bytesV1())
+		mbstr, err := mbase.Encode(mbase.Base32, c.Bytes())
 		if err != nil {
 			panic("should not error with hardcoded mbase: " + err.Error())
 		}
@@ -324,63 +383,74 @@ func (c *Cid) String() string {

 // String returns the string representation of a Cid
 // encoded is selected base
-func (c *Cid) StringOfBase(base mbase.Encoding) (string, error) {
-	switch c.version {
+func (c Cid) StringOfBase(base mbase.Encoding) (string, error) {
+	switch c.Version() {
 	case 0:
 		if base != mbase.Base58BTC {
 			return "", ErrInvalidEncoding
 		}
-		return c.hash.B58String(), nil
+		return c.Hash().B58String(), nil
 	case 1:
-		return mbase.Encode(base, c.bytesV1())
+		return mbase.Encode(base, c.Bytes())
+	default:
+		panic("not possible to reach this point")
+	}
+}
+
+// Encode returns the string representation of a Cid in a given base
+// when applicable. Version 0 Cids are always in Base58 as they do
+// not take a multibase prefix.
+func (c Cid) Encode(base mbase.Encoder) string {
+	switch c.Version() {
+	case 0:
+		return c.Hash().B58String()
+	case 1:
+		return base.Encode(c.Bytes())
 	default:
 		panic("not possible to reach this point")
 	}
 }

 // Hash returns the multihash contained by a Cid.
-func (c *Cid) Hash() mh.Multihash {
-	return c.hash
+func (c Cid) Hash() mh.Multihash {
+	bytes := c.Bytes()
+
+	if c.Version() == 0 {
+		return mh.Multihash(bytes)
+	}
+
+	// skip version length
+	_, n1 := binary.Uvarint(bytes)
+	// skip codec length
+	_, n2 := binary.Uvarint(bytes[n1:])
+
+	return mh.Multihash(bytes[n1+n2:])
 }

 // Bytes returns the byte representation of a Cid.
 // The output of bytes can be parsed back into a Cid
 // with Cast().
-func (c *Cid) Bytes() []byte {
-	switch c.version {
-	case 0:
-		return c.bytesV0()
-	case 1:
-		return c.bytesV1()
-	default:
-		panic("not possible to reach this point")
-	}
+func (c Cid) Bytes() []byte {
+	return []byte(c.str)
 }

-func (c *Cid) bytesV0() []byte {
-	return []byte(c.hash)
+// MarshalBinary is equivalent to Bytes(). It implements the
+// encoding.BinaryMarshaler interface.
+func (c Cid) MarshalBinary() ([]byte, error) {
+	return c.Bytes(), nil
 }

-func (c *Cid) bytesV1() []byte {
-	// two 8 bytes (max) numbers plus hash
-	buf := make([]byte, 2*binary.MaxVarintLen64+len(c.hash))
-	n := binary.PutUvarint(buf, c.version)
-	n += binary.PutUvarint(buf[n:], c.codec)
-	cn := copy(buf[n:], c.hash)
-	if cn != len(c.hash) {
-		panic("copy hash length is inconsistent")
-	}
-
-	return buf[:n+len(c.hash)]
+// MarshalText is equivalent to String(). It implements the
+// encoding.TextMarshaler interface.
+func (c Cid) MarshalText() ([]byte, error) {
+	return []byte(c.String()), nil
 }

 // Equals checks that two Cids are the same.
 // In order for two Cids to be considered equal, the
 // Version, the Codec and the Multihash must match.
-func (c *Cid) Equals(o *Cid) bool {
-	return c.codec == o.codec &&
-		c.version == o.version &&
-		bytes.Equal(c.hash, o.hash)
+func (c Cid) Equals(o Cid) bool {
+	return c == o
 }

 // UnmarshalJSON parses the JSON representation of a Cid.
@@ -391,10 +461,15 @@ func (c *Cid) UnmarshalJSON(b []byte) error {
 	obj := struct {
 		CidTarget string `json:"/"`
 	}{}
-	err := json.Unmarshal(b, &obj)
+	objptr := &obj
+	err := json.Unmarshal(b, &objptr)
 	if err != nil {
 		return err
 	}
+	if objptr == nil {
+		*c = Cid{}
+		return nil
+	}

 	if obj.CidTarget == "" {
 		return fmt.Errorf("cid was incorrectly formatted")
@@ -405,9 +480,8 @@ func (c *Cid) UnmarshalJSON(b []byte) error {
 		return err
 	}

-	c.version = out.version
-	c.hash = out.hash
-	c.codec = out.codec
+	*c = out
+
 	return nil
 }

@@ -418,30 +492,33 @@ func (c *Cid) UnmarshalJSON(b []byte) error {
 // Note that this formatting comes from the IPLD specification
 // (https://github.com/ipld/specs/tree/master/ipld)
 func (c Cid) MarshalJSON() ([]byte, error) {
+	if !c.Defined() {
+		return []byte("null"), nil
+	}
 	return []byte(fmt.Sprintf("{\"/\":\"%s\"}", c.String())), nil
 }

-// KeyString casts the result of cid.Bytes() as a string, and returns it.
-func (c *Cid) KeyString() string {
-	return string(c.Bytes())
+// KeyString returns the binary representation of the Cid as a string.
+func (c Cid) KeyString() string {
+	return c.str
 }

 // Loggable returns a Loggable (as defined by
 // https://godoc.org/github.com/ipfs/go-log).
-func (c *Cid) Loggable() map[string]interface{} {
+func (c Cid) Loggable() map[string]interface{} {
 	return map[string]interface{}{
 		"cid": c,
 	}
 }

 // Prefix builds and returns a Prefix out of a Cid.
-func (c *Cid) Prefix() Prefix {
-	dec, _ := mh.Decode(c.hash) // assuming we got a valid multiaddr, this will not error
+func (c Cid) Prefix() Prefix {
+	dec, _ := mh.Decode(c.Hash()) // assuming we got a valid multihash, this will not error
 	return Prefix{
 		MhType:   dec.Code,
 		MhLength: dec.Length,
-		Version:  c.version,
-		Codec:    c.codec,
+		Version:  c.Version(),
+		Codec:    c.Type(),
 	}
 }

@@ -460,10 +537,15 @@ type Prefix struct {

 // Sum uses the information in a prefix to perform a multihash.Sum()
 // and return a newly constructed Cid with the resulting multihash.
-func (p Prefix) Sum(data []byte) (*Cid, error) {
-	hash, err := mh.Sum(data, p.MhType, p.MhLength)
+func (p Prefix) Sum(data []byte) (Cid, error) {
+	length := p.MhLength
+	if p.MhType == mh.ID {
+		length = -1
+	}
+
+	hash, err := mh.Sum(data, p.MhType, length)
 	if err != nil {
-		return nil, err
+		return Undef, err
 	}

 	switch p.Version {
@@ -472,7 +554,7 @@ func (p Prefix) Sum(data []byte) (*Cid, error) {
 	case 1:
 		return NewCidV1(p.Codec, hash), nil
 	default:
-		return nil, fmt.Errorf("invalid cid version")
+		return Undef, fmt.Errorf("invalid cid version")
 	}
 }

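Note on the cid.go changes above: a Cid is now a single string holding its binary form, and Version() is inferred from the bytes rather than stored. The following is a minimal standalone sketch of that layout using only the standard library. The names `sha256Multihash`, `packV1`, and `version` are hypothetical stand-ins for illustration, not part of go-cid, and the multihash is assembled by hand instead of via go-multihash.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// sha256Multihash (hypothetical) builds a sha2-256 multihash by hand:
// code 0x12, length 0x20, then the 32-byte digest.
func sha256Multihash(data []byte) []byte {
	sum := sha256.Sum256(data)
	return append([]byte{0x12, 0x20}, sum[:]...)
}

// packV1 (hypothetical) mirrors the layout written by NewCidV1 in the diff:
// <version varint = 1><codec varint><multihash>.
func packV1(codec uint64, mhash []byte) string {
	buf := make([]byte, 2*binary.MaxVarintLen64+len(mhash))
	n := binary.PutUvarint(buf, 1)
	n += binary.PutUvarint(buf[n:], codec)
	n += copy(buf[n:], mhash)
	return string(buf[:n])
}

// version (hypothetical) applies the same check as Cid.Version in the diff:
// a bare 34-byte sha2-256 multihash is treated as CIDv0, anything else as v1.
func version(s string) uint64 {
	if len(s) == 34 && s[0] == 18 && s[1] == 32 {
		return 0
	}
	return 1
}

func main() {
	mhash := sha256Multihash([]byte("TEST"))
	v0 := string(mhash)       // CIDv0 is just the multihash bytes
	v1 := packV1(0x71, mhash) // 0x71 is DagCBOR in the constants above
	fmt.Println(len(v0), version(v0))
	fmt.Println(len(v1), version(v1))
}
```

This also suggests why NewCidV0 now validates its input: with the version inferred from the byte pattern, a multihash that is not 34-byte sha2-256 would otherwise be read back as version 1.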
--- a/cid_fuzz.go
+++ b/cid_fuzz.go
@@ -23,7 +23,7 @@ func Fuzz(data []byte) int {
 	if err != nil {
 		panic(err.Error())
 	}
-	cid2 := &Cid{}
+	cid2 := Cid{}
 	err = cid2.UnmarshalJSON(json)
 	if err != nil {
 		panic(err.Error())
--- a/cid_test.go
+++ b/cid_test.go
@@ -19,6 +19,7 @@ var tCodecs = map[uint64]string{
 	Raw:                "raw",
 	DagProtobuf:        "protobuf",
 	DagCBOR:            "cbor",
+	Libp2pKey:          "libp2p-key",
 	GitRaw:             "git-raw",
 	EthBlock:           "eth-block",
 	EthBlockList:       "eth-block-list",
@@ -35,18 +36,20 @@ var tCodecs = map[uint64]string{
 	ZcashTx:            "zcash-tx",
 	DecredBlock:        "decred-block",
 	DecredTx:           "decred-tx",
+	DashBlock:          "dash-block",
+	DashTx:             "dash-tx",
 }

-func assertEqual(t *testing.T, a, b *Cid) {
-	if a.codec != b.codec {
+func assertEqual(t *testing.T, a, b Cid) {
+	if a.Type() != b.Type() {
 		t.Fatal("mismatch on type")
 	}

-	if a.version != b.version {
+	if a.Version() != b.Version() {
 		t.Fatal("mismatch on version")
 	}

-	if !bytes.Equal(a.hash, b.hash) {
+	if !bytes.Equal(a.Hash(), b.Hash()) {
 		t.Fatal("multihash mismatch")
 	}
 }
@@ -71,17 +74,41 @@ func TestTableForV0(t *testing.T) {
 	}
 }

+func TestPrefixSum(t *testing.T) {
+	// Test creating CIDs both manually and with Prefix.
+	// Tests: https://github.com/ipfs/go-cid/issues/83
+	for _, hashfun := range []uint64{
+		mh.ID, mh.SHA3, mh.SHA2_256,
+	} {
+		h1, err := mh.Sum([]byte("TEST"), hashfun, -1)
+		if err != nil {
+			t.Fatal(err)
+		}
+		c1 := NewCidV1(Raw, h1)
+
+		h2, err := mh.Sum([]byte("foobar"), hashfun, -1)
+		if err != nil {
+			t.Fatal(err)
+		}
+		c2 := NewCidV1(Raw, h2)
+
+		c3, err := c1.Prefix().Sum([]byte("foobar"))
+		if err != nil {
+			t.Fatal(err)
+		}
+		if !c2.Equals(c3) {
+			t.Fatal("expected CIDs to be equal")
+		}
+	}
+}
+
 func TestBasicMarshaling(t *testing.T) {
 	h, err := mh.Sum([]byte("TEST"), mh.SHA3, 4)
 	if err != nil {
 		t.Fatal(err)
 	}

-	cid := &Cid{
-		codec:   7,
-		version: 1,
-		hash:    h,
-	}
+	cid := NewCidV1(7, h)

 	data := cid.Bytes()

@@ -107,11 +134,7 @@ func TestBasesMarshaling(t *testing.T) {
 		t.Fatal(err)
 	}

-	cid := &Cid{
-		codec:   7,
-		version: 1,
-		hash:    h,
-	}
+	cid := NewCidV1(7, h)

 	data := cid.Bytes()

@@ -152,6 +175,53 @@ func TestBasesMarshaling(t *testing.T) {
 		}

 		assertEqual(t, cid, out2)
+
+		encoder, err := mbase.NewEncoder(b)
+		if err != nil {
+			t.Fatal(err)
+		}
+		s2 := cid.Encode(encoder)
+		if s != s2 {
+			t.Fatalf("'%s' != '%s'", s, s2)
+		}
+	}
+}
+
+func TestBinaryMarshaling(t *testing.T) {
+	data := []byte("this is some test content")
+	hash, _ := mh.Sum(data, mh.SHA2_256, -1)
+	c := NewCidV1(DagCBOR, hash)
+	var c2 Cid
+
+	data, err := c.MarshalBinary()
+	if err != nil {
+		t.Fatal(err)
+	}
+	err = c2.UnmarshalBinary(data)
+	if err != nil {
+		t.Fatal(err)
+	}
+	if !c.Equals(c2) {
+		t.Errorf("cids should be the same: %s %s", c, c2)
+	}
+}
+
+func TestTextMarshaling(t *testing.T) {
+	data := []byte("this is some test content")
+	hash, _ := mh.Sum(data, mh.SHA2_256, -1)
+	c := NewCidV1(DagCBOR, hash)
+	var c2 Cid
+
+	data, err := c.MarshalText()
+	if err != nil {
+		t.Fatal(err)
+	}
+	err = c2.UnmarshalText(data)
+	if err != nil {
+		t.Fatal(err)
+	}
+	if !c.Equals(c2) {
+		t.Errorf("cids should be the same: %s %s", c, c2)
 	}
 }

@@ -170,17 +240,33 @@ func TestV0Handling(t *testing.T) {
 		t.Fatal(err)
 	}

-	if cid.version != 0 {
+	if cid.Version() != 0 {
 		t.Fatal("should have gotten version 0 cid")
 	}

-	if cid.hash.B58String() != old {
-		t.Fatal("marshaling roundtrip failed")
+	if cid.Hash().B58String() != old {
+		t.Fatalf("marshaling roundtrip failed: %s != %s", cid.Hash().B58String(), old)
 	}

 	if cid.String() != old {
 		t.Fatal("marshaling roundtrip failed")
 	}
+
+	new, err := cid.StringOfBase(mbase.Base58BTC)
+	if err != nil {
+		t.Fatal(err)
+	}
+	if new != old {
+		t.Fatal("StringOfBase roundtrip failed")
+	}
+
+	encoder, err := mbase.NewEncoder(mbase.Base58BTC)
+	if err != nil {
+		t.Fatal(err)
+	}
+	if cid.Encode(encoder) != old {
+		t.Fatal("Encode roundtrip failed")
+	}
 }

 func TestV0ErrorCases(t *testing.T) {
@@ -281,9 +367,7 @@ func TestPrefixRoundtrip(t *testing.T) {
 func Test16BytesVarint(t *testing.T) {
 	data := []byte("this is some test content")
 	hash, _ := mh.Sum(data, mh.SHA2_256, -1)
-	c := NewCidV1(DagCBOR, hash)
-
-	c.codec = 1 << 63
+	c := NewCidV1(1<<63, hash)
 	_ = c.Bytes()
 }

@@ -326,8 +410,8 @@ func TestParse(t *testing.T) {
 		if err != nil {
 			return err
 		}
-		if cid.version != 0 {
-			return fmt.Errorf("expected version 0, got %s", string(cid.version))
+		if cid.Version() != 0 {
+			return fmt.Errorf("expected version 0, got %d", cid.Version())
 		}
 		actual := cid.Hash().B58String()
 		if actual != expected {
@@ -355,13 +439,13 @@ func TestHexDecode(t *testing.T) {
 		t.Fatal(err)
 	}

-	if c.String() != "zb2rhhFAEMepUBbGyP1k8tGfz7BSciKXP6GHuUeUsJBaK6cqG" {
+	if c.String() != "bafkreie5qrjvaw64n4tjm6hbnm7fnqvcssfed4whsjqxzslbd3jwhsk3mm" {
 		t.Fatal("hash value failed to round trip decoding from hex")
 	}
 }

 func ExampleDecode() {
-	encoded := "zb2rhhFAEMepUBbGyP1k8tGfz7BSciKXP6GHuUeUsJBaK6cqG"
+	encoded := "bafkreie5qrjvaw64n4tjm6hbnm7fnqvcssfed4whsjqxzslbd3jwhsk3mm"
 	c, err := Decode(encoded)
 	if err != nil {
 		fmt.Printf("Error: %s", err)
@@ -369,11 +453,11 @@ func ExampleDecode() {
 	}

 	fmt.Println(c)
-	// Output: zb2rhhFAEMepUBbGyP1k8tGfz7BSciKXP6GHuUeUsJBaK6cqG
+	// Output: bafkreie5qrjvaw64n4tjm6hbnm7fnqvcssfed4whsjqxzslbd3jwhsk3mm
 }

 func TestFromJson(t *testing.T) {
-	cval := "zb2rhhFAEMepUBbGyP1k8tGfz7BSciKXP6GHuUeUsJBaK6cqG"
+	cval := "bafkreie5qrjvaw64n4tjm6hbnm7fnqvcssfed4whsjqxzslbd3jwhsk3mm"
 	jsoncid := []byte(`{"/":"` + cval + `"}`)
 	var c Cid
 	err := json.Unmarshal(jsoncid, &c)
@@ -387,7 +471,7 @@ func TestFromJson(t *testing.T) {
 }

 func TestJsonRoundTrip(t *testing.T) {
-	exp, err := Decode("zb2rhhFAEMepUBbGyP1k8tGfz7BSciKXP6GHuUeUsJBaK6cqG")
+	exp, err := Decode("bafkreie5qrjvaw64n4tjm6hbnm7fnqvcssfed4whsjqxzslbd3jwhsk3mm")
 	if err != nil {
 		t.Fatal(err)
 	}
@@ -399,18 +483,35 @@ func TestJsonRoundTrip(t *testing.T) {
 	}
 	var actual Cid
 	err = json.Unmarshal(enc, &actual)
-	if !exp.Equals(&actual) {
+	if !exp.Equals(actual) {
 		t.Fatal("cids not equal for *Cid")
 	}

 	// Verify it works for a Cid.
-	enc, err = json.Marshal(*exp)
+	enc, err = json.Marshal(exp)
 	if err != nil {
 		t.Fatal(err)
 	}
 	var actual2 Cid
 	err = json.Unmarshal(enc, &actual2)
-	if !exp.Equals(&actual2) {
+	if !exp.Equals(actual2) {
 		t.Fatal("cids not equal for Cid")
 	}
 }
+
+func BenchmarkStringV1(b *testing.B) {
+	data := []byte("this is some test content")
+	hash, _ := mh.Sum(data, mh.SHA2_256, -1)
+	cid := NewCidV1(Raw, hash)
+
+	b.ReportAllocs()
+	b.ResetTimer()
+
+	count := 0
+	for i := 0; i < b.N; i++ {
+		count += len(cid.String())
+	}
+	if count != 59*b.N {
+		b.FailNow()
+	}
+}
--- a/format.go
+++ b/format.go
@@ -1,151 +0,0 @@
-package cid
-
-import (
-	"bytes"
-	"fmt"
-
-	mb "github.com/multiformats/go-multibase"
-	mh "github.com/multiformats/go-multihash"
-)
-
-// FormatRef is a string documenting the format string for the Format function
-const FormatRef = `
-   %% literal %
-   %b multibase name
-   %B multibase code
-   %v version string
-   %V version number
-   %c codec name
-   %C codec code
-   %h multihash name
-   %H multihash code
-   %L hash digest length
-   %m multihash encoded in base %b (with multibase prefix)
-   %M multihash encoded in base %b without multibase prefix
-   %d hash digest encoded in base %b (with multibase prefix)
-   %D hash digest encoded in base %b without multibase prefix
-   %s cid string encoded in base %b (1)
-   %S cid string encoded in base %b without multibase prefix
-   %P cid prefix: %v-%c-%h-%L
-
-(1) For CID version 0 the multibase must be base58btc and no prefix is
-used.  For Cid version 1 the multibase prefix is included.
-`
-
-// Format formats a cid according to the format specificer as
-// documented in the FormatRef constant
-func Format(fmtStr string, base mb.Encoding, cid *Cid) (string, error) {
-	p := cid.Prefix()
-	var out bytes.Buffer
-	var err error
-	encoder, err := mb.NewEncoder(base)
-	if err != nil {
-		return "", err
-	}
-	for i := 0; i < len(fmtStr); i++ {
-		if fmtStr[i] != '%' {
-			out.WriteByte(fmtStr[i])
-			continue
-		}
-		i++
-		if i >= len(fmtStr) {
-			return "", FormatStringError{"premature end of format string", ""}
-		}
-		switch fmtStr[i] {
-		case '%':
-			out.WriteByte('%')
-		case 'b': // base name
-			out.WriteString(baseToString(base))
-		case 'B': // base code
-			out.WriteByte(byte(base))
-		case 'v': // version string
-			fmt.Fprintf(&out, "cidv%d", p.Version)
-		case 'V': // version num
-			fmt.Fprintf(&out, "%d", p.Version)
-		case 'c': // codec name
-			out.WriteString(codecToString(p.Codec))
-		case 'C': // codec code
-			fmt.Fprintf(&out, "%d", p.Codec)
-		case 'h': // hash fun name
-			out.WriteString(hashToString(p.MhType))
-		case 'H': // hash fun code
-			fmt.Fprintf(&out, "%d", p.MhType)
-		case 'L': // hash length
-			fmt.Fprintf(&out, "%d", p.MhLength)
-		case 'm', 'M': // multihash encoded in base %b
-			out.WriteString(encode(encoder, cid.Hash(), fmtStr[i] == 'M'))
-		case 'd', 'D': // hash digest encoded in base %b
-			dec, err := mh.Decode(cid.Hash())
-			if err != nil {
-				return "", err
-			}
-			out.WriteString(encode(encoder, dec.Digest, fmtStr[i] == 'D'))
-		case 's': // cid string encoded in base %b
-			str, err := cid.StringOfBase(base)
-			if err != nil {
-				return "", err
-			}
-			out.WriteString(str)
-		case 'S': // cid string without base prefix
-			out.WriteString(encode(encoder, cid.Bytes(), true))
-		case 'P': // prefix
-			fmt.Fprintf(&out, "cidv%d-%s-%s-%d",
-				p.Version,
-				codecToString(p.Codec),
-				hashToString(p.MhType),
-				p.MhLength,
-			)
-		default:
-			return "", FormatStringError{"unrecognized specifier in format string", fmtStr[i-1 : i+1]}
-		}
-
-	}
-	return out.String(), err
-}
-
-// FormatStringError is the error return from Format when the format
-// string is ill formed
-type FormatStringError struct {
-	Message   string
-	Specifier string
-}
-
-func (e FormatStringError) Error() string {
-	if e.Specifier == "" {
-		return e.Message
-	} else {
-		return fmt.Sprintf("%s: %s", e.Message, e.Specifier)
-	}
-}
-
-func baseToString(base mb.Encoding) string {
-	baseStr, ok := mb.EncodingToStr[base]
-	if !ok {
-		return fmt.Sprintf("base?%c", base)
-	}
-	return baseStr
-}
-
-func codecToString(num uint64) string {
-	name, ok := CodecToStr[num]
-	if !ok {
-		return fmt.Sprintf("codec?%d", num)
-	}
-	return name
-}
-
-func hashToString(num uint64) string {
-	name, ok := mh.Codes[num]
-	if !ok {
-		return fmt.Sprintf("hash?%d", num)
-	}
-	return name
-}
-
-func encode(base mb.Encoder, data []byte, strip bool) string {
-	str := base.Encode(data)
-	if strip {
-		return str[1:]
-	}
-	return str
-}
--- a/format_test.go
+++ b/format_test.go
@@ -1,73 +0,0 @@
-package cid
-
-import (
-	"fmt"
-	"testing"
-
-	mb "github.com/multiformats/go-multibase"
-)
-
-func TestFmt(t *testing.T) {
-	cids := map[string]string{
-		"cidv0": "QmUNLLsPACCz1vLxQVkXqqLX5R1X345qqfHbsf67hvA3Nn",
-		"cidv1": "zdj7WfLr9DhLrb1hsoSi4fSdjjxuZmeqgEtBPWxMLtPbDNbFD",
-	}
-	tests := []struct {
-		cidId   string
-		newBase mb.Encoding
-		fmtStr  string
-		result  string
-	}{
-		{"cidv0", -1, "%P", "cidv0-protobuf-sha2-256-32"},
-		{"cidv0", -1, "%b-%v-%c-%h-%L", "base58btc-cidv0-protobuf-sha2-256-32"},
-		{"cidv0", -1, "%s", "QmUNLLsPACCz1vLxQVkXqqLX5R1X345qqfHbsf67hvA3Nn"},
-		{"cidv0", -1, "%S", "QmUNLLsPACCz1vLxQVkXqqLX5R1X345qqfHbsf67hvA3Nn"},
-		{"cidv0", -1, "ver#%V/#%C/#%H/%L", "ver#0/#112/#18/32"},
-		{"cidv0", -1, "%m", "zQmUNLLsPACCz1vLxQVkXqqLX5R1X345qqfHbsf67hvA3Nn"},
-		{"cidv0", -1, "%M", "QmUNLLsPACCz1vLxQVkXqqLX5R1X345qqfHbsf67hvA3Nn"},
-		{"cidv0", -1, "%d", "z72gdmFAgRzYHkJzKiL8MgMMRW3BTSCGyDHroPxJbxMJn"},
-		{"cidv0", -1, "%D", "72gdmFAgRzYHkJzKiL8MgMMRW3BTSCGyDHroPxJbxMJn"},
-		{"cidv0", 'B', "%S", "CIQFTFEEHEDF6KLBT32BFAGLXEZL4UWFNWM4LFTLMXQBCERZ6CMLX3Y"},
-		{"cidv0", 'B', "%B%S", "BCIQFTFEEHEDF6KLBT32BFAGLXEZL4UWFNWM4LFTLMXQBCERZ6CMLX3Y"},
-		{"cidv1", -1, "%P", "cidv1-protobuf-sha2-256-32"},
-		{"cidv1", -1, "%b-%v-%c-%h-%L", "base58btc-cidv1-protobuf-sha2-256-32"},
-		{"cidv1", -1, "%s", "zdj7WfLr9DhLrb1hsoSi4fSdjjxuZmeqgEtBPWxMLtPbDNbFD"},
-		{"cidv1", -1, "%S", "dj7WfLr9DhLrb1hsoSi4fSdjjxuZmeqgEtBPWxMLtPbDNbFD"},
-		{"cidv1", -1, "ver#%V/#%C/#%H/%L", "ver#1/#112/#18/32"},
-		{"cidv1", -1, "%m", "zQmYFbmndVP7QqAVWyKhpmMuQHMaD88pkK57RgYVimmoh5H"},
-		{"cidv1", -1, "%M", "QmYFbmndVP7QqAVWyKhpmMuQHMaD88pkK57RgYVimmoh5H"},
-		{"cidv1", -1, "%d", "zAux4gVVsLRMXtsZ9fd3tFEZN4jGYB6kP37fgoZNTc11H"},
-		{"cidv1", -1, "%D", "Aux4gVVsLRMXtsZ9fd3tFEZN4jGYB6kP37fgoZNTc11H"},
-		{"cidv1", 'B', "%s", "BAFYBEIETJGSRL3EQPQPCABV3G6IUBYTSIFVQ24XRRHD3JUETSKLTGQ7DJA"},
-		{"cidv1", 'B', "%S", "AFYBEIETJGSRL3EQPQPCABV3G6IUBYTSIFVQ24XRRHD3JUETSKLTGQ7DJA"},
-		{"cidv1", 'B', "%B%S", "BAFYBEIETJGSRL3EQPQPCABV3G6IUBYTSIFVQ24XRRHD3JUETSKLTGQ7DJA"},
-	}
-	for _, tc := range tests {
-		name := fmt.Sprintf("%s/%s", tc.cidId, tc.fmtStr)
-		if tc.newBase != -1 {
-			name = fmt.Sprintf("%s/%c", name, tc.newBase)
-		}
-		cidStr := cids[tc.cidId]
-		t.Run(name, func(t *testing.T) {
-			testFmt(t, cidStr, tc.newBase, tc.fmtStr, tc.result)
-		})
-	}
-}
-
-func testFmt(t *testing.T, cidStr string, newBase mb.Encoding, fmtStr string, result string) {
-	cid, err := Decode(cidStr)
-	if err != nil {
-		t.Fatal(err)
-	}
-	base := newBase
-	if newBase == -1 {
-		base, _ = ExtractEncoding(cidStr)
-	}
-	str, err := Format(fmtStr, base, cid)
-	if err != nil {
-		t.Fatal(err)
-	}
-	if str != result {
-		t.Error(fmt.Sprintf("expected: %s; but got: %s", result, str))
-	}
-}
--- a/go.mod
+++ b/go.mod
@@ -0,0 +1,6 @@
+module github.com/ipfs/go-cid
+
+require (
+	github.com/multiformats/go-multibase v0.0.1
+	github.com/multiformats/go-multihash v0.0.1
+)
--- a/go.sum
+++ b/go.sum
@@ -0,0 +1,20 @@
+github.com/gxed/hashland/keccakpg v0.0.1 h1:wrk3uMNaMxbXiHibbPO4S0ymqJMm41WiudyFSs7UnsU=
+github.com/gxed/hashland/keccakpg v0.0.1/go.mod h1:kRzw3HkwxFU1mpmPP8v1WyQzwdGfmKFJ6tItnhQ67kU=
+github.com/gxed/hashland/murmur3 v0.0.1 h1:SheiaIt0sda5K+8FLz952/1iWS9zrnKsEJaOJu4ZbSc=
+github.com/gxed/hashland/murmur3 v0.0.1/go.mod h1:KjXop02n4/ckmZSnY2+HKcLud/tcmvhST0bie/0lS48=
+github.com/minio/blake2b-simd v0.0.0-20160723061019-3f5f724cb5b1 h1:lYpkrQH5ajf0OXOcUbGjvZxxijuBwbbmlSxLiuofa+g=
+github.com/minio/blake2b-simd v0.0.0-20160723061019-3f5f724cb5b1/go.mod h1:pD8RvIylQ358TN4wwqatJ8rNavkEINozVn9DtGI3dfQ=
+github.com/minio/sha256-simd v0.0.0-20190131020904-2d45a736cd16 h1:5W7KhL8HVF3XCFOweFD3BNESdnO8ewyYTFT2R+/b8FQ=
+github.com/minio/sha256-simd v0.0.0-20190131020904-2d45a736cd16/go.mod h1:2FMWW+8GMoPweT6+pI63m9YE3Lmw4J71hV56Chs1E/U=
+github.com/mr-tron/base58 v1.1.0 h1:Y51FGVJ91WBqCEabAi5OPUz38eAx8DakuAm5svLcsfQ=
+github.com/mr-tron/base58 v1.1.0/go.mod h1:xcD2VGqlgYjBdcBLw+TuYLr8afG+Hj8g2eTVqeSzSU8=
+github.com/multiformats/go-base32 v0.0.3 h1:tw5+NhuwaOjJCC5Pp82QuXbrmLzWg7uxlMFp8Nq/kkI=
+github.com/multiformats/go-base32 v0.0.3/go.mod h1:pLiuGC8y0QR3Ue4Zug5UzK9LjgbkL8NSQj0zQ5Nz/AA=
+github.com/multiformats/go-multibase v0.0.1 h1:PN9/v21eLywrFWdFNsFKaU04kLJzuYzmrJR+ubhT9qA=
+github.com/multiformats/go-multibase v0.0.1/go.mod h1:bja2MqRZ3ggyXtZSEDKpl0uO/gviWFaSteVbWT51qgs=
+github.com/multiformats/go-multihash v0.0.1 h1:HHwN1K12I+XllBCrqKnhX949Orn4oawPkegHMu2vDqQ=
+github.com/multiformats/go-multihash v0.0.1/go.mod h1:w/5tugSrLEbWqlcgJabL3oHFKTwfvkofsjW2Qa1ct4U=
+golang.org/x/crypto v0.0.0-20190211182817-74369b46fc67 h1:ng3VDlRp5/DHpSWl02R4rM9I+8M2rhmsuLwAMmkLQWE=
+golang.org/x/crypto v0.0.0-20190211182817-74369b46fc67/go.mod h1:6SG95UA2DQfeDnfUPMdvaQW0Q7yPrPDi9nlGo2tz2b4=
+golang.org/x/sys v0.0.0-20190219092855-153ac476189d h1:Z0Ahzd7HltpJtjAHHxX8QFP3j1yYgiuvjbjRzDj/KH0=
+golang.org/x/sys v0.0.0-20190219092855-153ac476189d/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
--- a/package.json
+++ b/package.json
@@ -9,15 +9,15 @@
  "gxDependencies": [
    {
      "author": "whyrusleeping",
-      "hash": "QmPnFwZ2JXKnXgMw8CdBPxn7FWh6LLdjUjxV1fKHuJnkr8",
+      "hash": "QmerPMzPk1mJVowm8KgmoknWa4yCYvvugMPsgWmDNUvDLW",
      "name": "go-multihash",
-      "version": "1.0.8"
+      "version": "1.0.9"
    },
    {
      "author": "whyrusleeping",
-      "hash": "QmSbvata2WqNkqGtZNg8MR3SKwnB8iQ7vTPJgWqB8bC5kR",
+      "hash": "QmekxXDhCxCJRNuzmHreuaT3BsuJcsjcXWNrtV9C8DRHtd",
      "name": "go-multibase",
-      "version": "0.2.7"
+      "version": "0.3.0"
    }
  ],
  "gxVersion": "0.8.0",
@@ -25,6 +25,6 @@
  "license": "MIT",
  "name": "go-cid",
  "releaseCmd": "git commit -a -m \"gx publish $VERSION\"",
-  "version": "0.7.24"
+  "version": "0.9.3"
 }

--- a/set.go
+++ b/set.go
@@ -3,28 +3,28 @@ package cid
 // Set is a implementation of a set of Cids, that is, a structure
 // to which holds a single copy of every Cids that is added to it.
 type Set struct {
-	set map[string]struct{}
+	set map[Cid]struct{}
 }

 // NewSet initializes and returns a new Set.
 func NewSet() *Set {
-	return &Set{set: make(map[string]struct{})}
+	return &Set{set: make(map[Cid]struct{})}
 }

 // Add puts a Cid in the Set.
-func (s *Set) Add(c *Cid) {
-	s.set[string(c.Bytes())] = struct{}{}
+func (s *Set) Add(c Cid) {
+	s.set[c] = struct{}{}
 }

 // Has returns if the Set contains a given Cid.
-func (s *Set) Has(c *Cid) bool {
-	_, ok := s.set[string(c.Bytes())]
+func (s *Set) Has(c Cid) bool {
+	_, ok := s.set[c]
 	return ok
 }

 // Remove deletes a Cid from the Set.
-func (s *Set) Remove(c *Cid) {
-	delete(s.set, string(c.Bytes()))
+func (s *Set) Remove(c Cid) {
+	delete(s.set, c)
 }

 // Len returns how many elements the Set has.
@@ -33,18 +33,17 @@ func (s *Set) Len() int {
 }

 // Keys returns the Cids in the set.
-func (s *Set) Keys() []*Cid {
-	out := make([]*Cid, 0, len(s.set))
+func (s *Set) Keys() []Cid {
+	out := make([]Cid, 0, len(s.set))
 	for k := range s.set {
-		c, _ := Cast([]byte(k))
-		out = append(out, c)
+		out = append(out, k)
 	}
 	return out
 }

 // Visit adds a Cid to the set only if it is
 // not in it already.
-func (s *Set) Visit(c *Cid) bool {
+func (s *Set) Visit(c Cid) bool {
 	if !s.Has(c) {
 		s.Add(c)
 		return true
@@ -55,9 +54,8 @@ func (s *Set) Visit(c *Cid) bool {

 // ForEach allows to run a custom function on each
 // Cid in the set.
-func (s *Set) ForEach(f func(c *Cid) error) error {
-	for cs := range s.set {
-		c, _ := Cast([]byte(cs))
+func (s *Set) ForEach(f func(c Cid) error) error {
+	for c := range s.set {
 		err := f(c)
 		if err != nil {
 			return err
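Note on the set.go changes above: because the new Cid wraps a single string, its values are comparable with `==` and can be used directly as map keys, so Set no longer needs the `string(c.Bytes())` and `Cast` round trip. The following is a minimal sketch of that pattern with a stand-in type. `id`, `idSet`, and `newIDSet` are hypothetical names for illustration, not go-cid identifiers.

```go
package main

import "fmt"

// id is a hypothetical stand-in for the string-backed Cid type.
type id struct{ str string }

// idSet mirrors the shape of Set after this change: the keys are the
// values themselves, so nothing has to be re-parsed when iterating.
type idSet struct {
	set map[id]struct{}
}

func newIDSet() *idSet { return &idSet{set: make(map[id]struct{})} }

// Visit adds v and reports true only if it was not already present.
func (s *idSet) Visit(v id) bool {
	if _, ok := s.set[v]; ok {
		return false
	}
	s.set[v] = struct{}{}
	return true
}

func main() {
	s := newIDSet()
	a := id{"\x01\x71abc"}
	b := id{"\x01\x71abc"} // separate value with identical contents
	fmt.Println(a == b, s.Visit(a), s.Visit(b), len(s.set))
}
```

The same property is what lets Equals reduce to `c == o` in cid.go.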
--- a/set_test.go
+++ b/set_test.go
@@ -8,7 +8,7 @@ import (
 	mh "github.com/multiformats/go-multihash"
 )

-func makeRandomCid(t *testing.T) *Cid {
+func makeRandomCid(t *testing.T) Cid {
 	p := make([]byte, 256)
 	_, err := rand.Read(p)
 	if err != nil {
@@ -20,11 +20,7 @@ func makeRandomCid(t *testing.T) *Cid {
 		t.Fatal(err)
 	}

-	cid := &Cid{
-		codec:   7,
-		version: 1,
-		hash:    h,
-	}
+	cid := NewCidV1(7, h)

 	return cid
 }
@@ -54,8 +50,8 @@ func TestSet(t *testing.T) {
 		t.Error("visit should return false")
 	}

-	foreach := []*Cid{}
-	foreachF := func(c *Cid) error {
+	foreach := []Cid{}
+	foreachF := func(c Cid) error {
 		foreach = append(foreach, c)
 		return nil
 	}
@@ -68,7 +64,7 @@ func TestSet(t *testing.T) {
 		t.Error("ForEach should have visited 1 element")
 	}

-	foreachErr := func(c *Cid) error {
+	foreachErr := func(c Cid) error {
 		return errors.New("test")
 	}

--- a/varint.go
+++ b/varint.go
@@ -0,0 +1,34 @@
+package cid
+
+// Version of the varint function that works with a string rather than
+// []byte to avoid unnecessary allocation
+
+// Copyright 2011 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license as given at https://golang.org/LICENSE
+
+// uvarint decodes a uint64 from buf and returns that value and the
+// number of bytes read (> 0). If an error occurred, the value is 0
+// and the number of bytes n is <= 0 meaning:
+//
+// 	n == 0: buf too small
+// 	n  < 0: value larger than 64 bits (overflow)
+// 	        and -n is the number of bytes read
+//
+func uvarint(buf string) (uint64, int) {
+	var x uint64
+	var s uint
+	// buf is a binary string, so index it byte by byte; a range loop would decode runes
+	for i := 0; i < len(buf); i++ {
+		b := buf[i]
+		if b < 0x80 {
+			if i > 9 || i == 9 && b > 1 {
+				return 0, -(i + 1) // overflow
+			}
+			return x | uint64(b)<<s, i + 1
+		}
+		x |= uint64(b&0x7f) << s
+		s += 7
+	}
+	return 0, 0
+}
--- a/varint_test.go
+++ b/varint_test.go
@@ -0,0 +1,22 @@
+package cid
+
+import (
+	"encoding/binary"
+	"testing"
+)
+
+func TestUvarintRoundTrip(t *testing.T) {
+	testCases := []uint64{0, 1, 2, 127, 128, 129, 255, 256, 257, 1<<63 - 1}
+	for _, tc := range testCases {
+		buf := make([]byte, 16)
+		binary.PutUvarint(buf, tc)
+		v, l1 := uvarint(string(buf))
+		_, l2 := binary.Uvarint(buf)
+		if tc != v {
+			t.Errorf("roundtrip failed expected %d but got %d", tc, v)
+		}
+		if l1 != l2 {
+			t.Errorf("length incorrect expected %d but got %d", l2, l1)
+		}
+	}
+}
Author	SHA1	Message	Date
Steven Allen	9bb7ea6920	Merge pull request #86 from lidel/feat/libp2p-key Add support for libp2p-key multicodec to go-cid	2019-05-27 16:46:10 -07:00
Marcin Rataj	3f1777738f	Add libp2p-key multicodec Context: https://github.com/multiformats/multicodec/issues/130 License: MIT Signed-off-by: Marcin Rataj <lidel@lidel.org>	2019-05-28 01:40:54 +02:00
Steven Allen	b1cc3e404d	Merge pull request #85 from ipfs/feat/cidv1-default-base32 default cidv1 to base32	2019-05-13 10:54:14 -07:00
Steven Allen	b16425b966	make CID in readme base32	2019-05-10 09:41:01 -07:00
Steven Allen	f04f9216e7	default cidv1 to base32	2019-05-06 16:22:17 -07:00
Jakub Sztandera	e7e67e08cf	Add gomod and travis	2019-02-28 18:32:58 +01:00
Steven Allen	29a66d1820	gx publish 0.9.3	2019-02-20 20:12:25 -08:00
Steven Allen	08f30d213e	Merge pull request #84 from ipfs/fix/83 fix inline CIDs generated by Prefix.Sum	2019-02-20 20:11:24 -08:00
Steven Allen	cf3b4efcaf	fix inline CIDs generated by Prefix.Sum	2019-02-20 19:06:04 -08:00
Hector Sanjuan	ca991e8eb6	Merge pull request #82 from ipfs/gx/0.9.2 gx publish 0.9.2	2019-02-20 16:21:43 +00:00
Hector Sanjuan	8d327b2f4b	gx publish 0.9.2	2019-02-20 16:21:01 +00:00
Hector Sanjuan	14b828acf5	Merge pull request #81 from ipfs/feat/binary-marshaler Let Cid implement Binary[Un]Marshaler and Text[Un]Marshaler interfaces.	2019-02-20 16:20:09 +00:00
Hector Sanjuan	00439572fb	Let Cid implement Binary[Un]Marshaler and Text[Un]Marshaler interfaces. This makes Cid implement https://golang.org/pkg/encoding/#BinaryMarshaler which is used by go-codec to decide if things know how to serialize themselves (currently we need do manual wrapping for anything containing a CID). Since I was at it, I did the TextMarshaling one too.	2019-02-19 16:41:30 +00:00
Jakub Sztandera	37bf2f9503	Merge pull request #80 from madper/fix_typo fix typo in comment	2019-02-15 18:30:48 +01:00
Madper Xie	e6d04f280e	fix typo in comment	2019-02-15 20:19:46 +08:00
Steven Allen	033594dcd6	gx publish 0.9.1	2018-11-02 16:51:23 -07:00
Steven Allen	c9e99b39db	Merge pull request #78 from samli88/dash-codecs add codecs for Dash blocks, tx	2018-10-23 05:29:30 +01:00
Samuel Li	3ec3578fe9	add dash to codecs table	2018-10-07 11:44:18 -07:00
Samuel Li	628ab3426c	add codecs for Dash blocks, tx	2018-10-07 11:21:41 -07:00
Kevin Atkinson	6e296c5c49	gx publish 0.9.0	2018-09-11 19:18:20 -04:00
Kevin Atkinson	f0033600ca	Gx update go-multibase.	2018-09-11 19:17:40 -04:00
Kevin Atkinson	dfc48d3ec4	Make sure we have a SHA2_256, length 32 hash when creating a CidV0.	2018-09-11 19:17:40 -04:00
Kevin Atkinson	46dd393ad1	Handel undefined Cid is JSON representation.	2018-09-07 14:03:03 -04:00
Kevin Atkinson	67a2bcf7e7	Change 'Nil' constant to 'Undef'.	2018-09-05 15:43:18 -04:00
Kevin Atkinson	643f78a8f9	Change 'IsNil' method to 'Defined'.	2018-09-05 03:26:26 -04:00
Kevin Atkinson	7b4617fa6e	Eliminate unnecessary copy of Cid now that its an immutable string.	2018-09-01 00:09:38 -04:00
Kevin Atkinson	440a1c1a5a	Removed description of layout of CID as it is not correct for CIDv0.	2018-08-31 00:35:55 -04:00
Kevin Atkinson	e0a5698af9	Add IsNil() method.	2018-08-31 00:35:54 -04:00
Kevin Atkinson	667c6a9418	Avoid allocating memory in Type() method.	2018-08-31 00:35:53 -04:00
Kevin Atkinson	426ebe9e55	Simplify assignment in UnmarshalJSON.	2018-08-31 00:35:52 -04:00
Kevin Atkinson	cad52160a4	Ensure we always have a valid Cid by hiding the type in a struct.	2018-08-31 00:35:51 -04:00
Kevin Atkinson	b5a08dcaaa	Change EmptyCid to just Nil.	2018-08-31 00:35:51 -04:00
Kevin Atkinson	9831436a6f	Change string representation to represent actual binary representation.	2018-08-31 00:35:47 -04:00
Kevin Atkinson	d7974d2277	Export version() method, various other code cleanups.	2018-08-31 00:34:19 -04:00
dignifiedquire	8009448a20	fix KeyString()	2018-08-31 00:34:03 -04:00
dignifiedquire	92496b5494	use string instead of []byte as underlying store	2018-08-31 00:34:02 -04:00
dignifiedquire	e153340e5a	feat: use CIDs as their byte representation instead of a struct	2018-08-31 00:33:47 -04:00
Steven Allen	6ddb575a8d	Merge pull request #60 from ipfs/kevina/cid-fmtb Create a new Encode method that is like StringOfBase but never errors	2018-08-30 23:06:50 +00:00
Kevin Atkinson	b3d85b3dee	Enhance documentation for Encode method.	2018-08-30 00:38:17 -04:00
Kevin Atkinson	bea727bbd1	Enhance tests.	2018-08-30 00:30:14 -04:00
Kevin Atkinson	9091e50b29	Rename Format method to Encode.	2018-08-30 00:30:14 -04:00
Kevin Atkinson	a0b3b11e63	Create a new Format method that is like StringOfBase but never errors	2018-08-30 00:30:14 -04:00
Steven Allen	1766ab0fcf	Merge pull request #72 from ipfs/rsrch-cid-as-struct-wrapped-str cid implementation variations++	2018-08-30 01:52:53 +00:00
Eric Myhre	924534b811	Inspect memory layout of struct wrapping string. It's also viable. Options list expanded. (And regretting my ordering of it now. Wish I'd thought of this one and realized it's distinct earlier.)	2018-08-28 01:00:21 +02:00
Eric Myhre	5ddbe21740	Merge pull request #70 from ipfs/rsrch cid implementation research	2018-08-28 00:34:28 +02:00
Eric Myhre	2cf56e3813	Benchmarks of various Cid types as map keys. And writeup on same. tl;dr interfaces are not cheap if you're already at the scale where you started caring about whether or not you have pointers.	2018-08-24 14:03:54 +02:00
Eric Myhre	5a6d4bdf06	More readme on the state of iface research.	2018-08-24 13:18:50 +02:00
Eric Myhre	fb8ecaccad	Enumerate some more options in prose.	2018-08-24 12:37:52 +02:00
Eric Myhre	348b9201a6	Start a readme for this research project. Right now this is mostly this is to document the behavior of interface-keyed maps. I suspect some of those caveats might be non-obvious to a lot of folks.	2018-08-24 12:24:34 +02:00
Eric Myhre	b4ab25ffda	Discovered interesting case in map key checking. Using interfaces as a map's key type can cause some things that were otherwise compile-time checks to be pushed off into runtime checks instead. This is a pretty major "caveat emptor" if you use interface-keyed maps.	2018-08-24 12:18:07 +02:00
Eric Myhre	c724ad0d22	cid impl via struct and via string together. Added back in some of the parser methods. (These were previously named "Cast" and I think that's silly and wrong so I fixed it.) Functions are named overly-literally with their type (e.g. ParseCidString and ParseCidStruct rather than ParseCid or even just Parse) because for this research package I don't want to bother with many sub-packages. (Maybe I'll regret this, but at the moment it seems simpler to hold back on sub-packages.) Functions that produce Cids are literal with their return types, as well. Part of the purpose of this research package is going to be to concretely benchmark exactly how much performance overhead there is to using interfaces (which will likely cause a lot of boxing and unboxing in practice) -- since we want to explore where this boxing happens and how much it costs, it's important that none of our basic implementation functions do the boxing! The entire set of codec enums came along in this commit. Ah well; they would have eventually anyway, I guess. But it's interesting to note the only thing that dragged them along so far is the reference to 'DagProtobuf' when constructing v0 CIDs; otherwise, this enum is quite unused here.	2018-08-24 12:00:31 +02:00
Eric Myhre	ff25e9673c	Open research dir; want to explore cid impl perf. It's been discussed in several issues and PRs already that we might want to explore various ways of implementing CIDs for maximum performance and ease-of-use because they show up extremely often. Current CIDs are pointers, which generally speaking means you can't get one without a malloc; and also, they're not particularly well-suited for use in map keys. This branch is to attempt to consolidate all the proposals so far -- and do so in a single branch which can be checked out and contains all the proposals at once, because this will make it easy to do benchmarks and compare all of the various ways we could implement this in one place (and also easier for humans to track what the latest of each proposal is, since they're all in one place). To start with: a Cid implementation backed by a string; and matching interface. (I'm also taking this opportunity to be as minimalistic as possible in what I port over into these experimental new Cid implementations. This might not last; but as long as all this work is to be done, it's a more convenient time than usual to see what can be stripped down and still get work done.) More to come.	2018-08-24 10:53:52 +02:00
Kevin Atkinson	afcde25c66	gx publish 0.8.0	2018-08-21 15:15:34 -04:00
Kevin Atkinson	fb85ebd768	Merge pull request #69 from ipfs/kevina/extract Extract non-core functionality from go-cid into go-cidutil	2018-08-21 15:11:14 -04:00
Kevin Atkinson	870aa9e7de	Extract non-core functionality from go-cid into go-cidutil.	2018-08-16 21:51:31 -04:00
Steven Allen	73e5246a65	gx publish 0.7.25	2018-08-15 08:24:56 -07:00
Steven Allen	83a7594d41	Merge pull request #67 from ipfs/feat/streaming-set add a streaming CID set	2018-08-11 01:06:53 +00:00
Łukasz Magiera	3655c1cdd4	add a streaming CID set used in https://github.com/ipfs/go-ipfs/pull/4804	2018-08-10 17:32:43 -07:00
Steven Allen	1543f4a136	Merge pull request #44 from ipfs/feat/bench add String benchmark	2018-08-10 23:55:30 +00:00
Steven Allen	d6e0b4e5a7	add String benchmark We call String all over the place so we should make sure it remains fast.	2018-08-10 16:23:25 -07:00