2020 Mar 5;101(3):271–283. doi: 10.1099/jgv.0.001387

Table 1.

Proposed reference sequences for HBV genotypes, subgenotypes and clades

The number of sequences in each clade is given for each subgenotype and clade identified. Note that the subgenotype cluster counts may not sum to the total number of genotype sequences, because some sequences did not group within a specific clade. Reference sequences for the genotypes are highlighted in grey boxes. In Figs 3–10, genotype references are marked with blue dots and subgenotype reference sequences are marked with red dots. Subgenotypes B5, C7, C9 and D6 are not included, as these sequences either did not cluster as monophyletic clades or were not retained in our analysis. Hamming distance is the number of nucleotide differences between the clade consensus and the chosen reference. Pairwise distance is that same number of differences normalized by the length of the genome.

| HBV genotype | Subgenotype | No. of sequences | Reference GenBank ID | Hamming distance | Pairwise distance | References | Collection year* | Country of origin |
|---|---|---|---|---|---|---|---|---|
| A | A1 (1) | 50 | KP168423 | 26 | 0.008 | [55] | 2012 | Kenya |
| A | A1 (2) | 87 | FJ692557 | 21 | 0.007 | [56] | 2006 | Haiti |
| A | A2 | 80 | EU594385 | 5 | 0.002 | [57] | 2004 | Estonia |
| A | A3 | 9 | AM184126 | 29 | 0.009 | [58] | 2005 | Gabon |
| A | A4 | 2 | KM606737 | n/a | n/a | [59] | 2015 | Cuba |
| A | A5 | 25 | FJ692601 | 10 | 0.003 | [60] | 2006 | Haiti |
| A | A6 | 2 | GQ331046 | n/a | n/a | [37] | 2006 | Belgium |
| B | B1 | 47 | D23679 | 23 | 0.007 | [61] | 1993 | Japan |
| B | B2 (1) | 131 | JQ801514 | 15 | 0.005 | [62] | 2009 | Thailand |
| B | B2 (2) | 294 | GU815637 | 10 | 0.003 | [63] | 2010 | Taiwan, ROC |
| B | B3 | 106 | AP011085 | 23 | 0.007 | [64] | 2001 | Indonesia |
| B | B4 | 69 | AB073835 | 35 | 0.011 | [65] | 2001 | Japan |
| B | B6 | 36 | AB287314 | 29 | 0.009 | [66] | 2006 | Alaska |
| C | C1† | 240 | DQ089781 | 29 | 0.009 | [67] | 2005 | Hong Kong SAR |
| C | C2 (1) | 261 | KC774182 | 21 | 0.007 | [68] | 2012 | PR China |
| C | C2 (2) | 280 | GQ377617 | 14 | 0.004 | [69] | 2007 | PR China |
| C | C2 (3) | 157 | AP011098 | 10 | 0.003 | [70] | 2009 | Indonesia |
| C | C4 | 21 | KF873526 | 68 | 0.021 | [71] | 2011 | Australia |
| C | C5 | 15 | AP011099 | 41 | 0.013 | [70] | 2009 | Indonesia |
| C | C6‡ | 2 | EU670263 | 28 | 0.009 | [29] | 2008 | Philippines |
| C | C8 | 15 | AP011107 | 42 | 0.013 | [70] | 2009 | Indonesia |
| C | C10 | 21 | KJ173333 | 33 | 0.010 | [72] | 2012 | PR China |
| C | C11§ | 28 | AB554015 | 86 | 0.027 | [28] | 2010 | Indonesia |
| C | UA (1) | 31 | KC774298 | 15 | 0.005 | [68] | 2012 | PR China |
| C | UA (2) | 16 | DQ089802 | 55 | 0.018 | [67] | 2005 | Hong Kong SAR |
| D | D1 (1) | 216 | AB222711 | 11 | 0.003 | [73] | 2005 | Uzbekistan |
| D | D1 (2) | 106 | KC875277 | 17 | 0.005 | [74] | 2013 | India |
| D | D2 | 100 | MF925358 | 21 | 0.007 | [75] | 2015 | Bangladesh |
| D | D3 | 78 | FJ692507 | 18 | 0.006 | [60] | 2006 | Haiti |
| D | D4 | 15 | FJ692533 | 17 | 0.008 | [60] | 2006 | Haiti |
| D | D5 | 15 | GQ205389 | 19 | 0.006 | [76] | 2008 | India |
| D | D7 | 15 | FJ904435 | 44 | 0.014 | [77] | 2006 | Tunisia |
| E | n/a | 145 | GQ161817 | 19 | 0.006 | [78] | 2006 | Guinea |
| F | F1 | 26 | HM585194 | 13 | 0.004 | [79] | 2007 | Chile |
| F | F2 | 13 | DQ899143 | 26 | 0.008 | [80] | 2006 | Venezuela |
| F | F3 | 19 | MH051986 | 17 | 0.005 | [81] | 2011 | Venezuela |
| F | F4 | 18 | KJ843175 | 17 | 0.005 | [82] | 2012 | Argentina |
| G | n/a | 3 | AB056513 | n/a | n/a | [83] | 2001 | USA |
| H | n/a | 11 | FJ356715 | 36 | 0.011 | [84] | 2008 | Argentina |
| I | I1 | 5 | AB562463 | 17 | 0.005 | [85] | 2007 | Vietnam |
| I | I2 | 4 | FJ023669 | 17 | 0.005 | [86] | 2008 | Laos |
| J¶ | n/a | 1 | AB486012 | n/a | n/a | [86] | 2006 | Japan/Borneo |

n/a, not applicable. In the Subgenotype column, the genotype does not diverge into multiple subtypes. In the Hamming distance and Pairwise distance columns, too few sequences belonging to the genotype/subgenotype were identified to generate a consensus sequence for selecting the closest biological isolate.

*Collection date of sample or year submitted to GenBank (if collection date not given).

†C1 is a large clade that also contains sequences labelled as subgenotype C3, with no clear separation between the two sets of sequences.

‡This sequence has been used previously as a subgenotype C7 reference in a number of publications [21, 22]. No other putative C7 sequences are proposed in the literature.

§For C11, a large number of sequences in this clade were unpublished (>30 first closest seqs).

¶Genotype J remains putative, with a single isolate identified in a Japanese patient. The isolate shows considerable divergence from other known HBV strains and is thought to be a recombinant of genotype C and a gibbon HBV isolate.
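
The two distance columns in Table 1 can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned to equal length and that the consensus is a simple per-column majority call. The handling of gaps, ambiguity codes and ties is not specified in the table, so the choices below are assumptions, and the function names (`hamming_distance`, `pairwise_distance`, `majority_consensus`, `closest_isolate`) and the toy 10-nt sequences are hypothetical, not the authors' pipeline or real HBV data.

```python
from collections import Counter
from typing import Dict, List, Tuple


def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Count aligned positions at which the two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))


def pairwise_distance(seq_a: str, seq_b: str) -> float:
    """Hamming distance normalized by the aligned sequence length."""
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    return hamming_distance(seq_a, seq_b) / len(seq_a)


def majority_consensus(aligned: List[str]) -> str:
    """Most common character per alignment column (assumed consensus rule)."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def closest_isolate(clade: Dict[str, str]) -> Tuple[str, int]:
    """Return the isolate nearest the clade consensus and its Hamming distance.

    Ties go to the isolate listed first (an assumption of this sketch).
    """
    consensus = majority_consensus(list(clade.values()))
    name = min(clade, key=lambda n: hamming_distance(clade[n], consensus))
    return name, hamming_distance(clade[name], consensus)


# Toy aligned sequences, invented for illustration only.
toy_clade = {
    "iso_a": "ATGCTTGCAT",
    "iso_b": "ATGCATGCAA",
    "iso_c": "TTGCATGGAT",
}
consensus = majority_consensus(list(toy_clade.values()))
reference, distance = closest_isolate(toy_clade)
print(consensus)                                            # ATGCATGCAT
print(reference, distance)                                  # iso_a 1
print(pairwise_distance(toy_clade[reference], consensus))   # 0.1
```

With real data, the same calculation would be run per clade on a full-genome alignment, which is why clades with very few sequences (the n/a rows) have no meaningful consensus to compare against.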