For a research worker the unforgotten moments of his life are
those rare ones,
which come after years of plodding work,
when the veil over nature’s secret seems suddenly to lift,
and when what was dark and chaotic
appears in a clear and beautiful light and pattern
— Gerty Cori [103]
To my parents and teachers,
who managed to sustain the curiosity in me.
Abstract
Almost every macromolecule in a cell or living organism has to
interact, transiently or permanently, with other molecules to
fulfill its role. Around 30–40% of these macromolecules are
predicted to interact with metal ions, and this interaction is
often obligatory for the macromolecule to be biologically active.
Given those numbers, it is surprising that, to date, studies
dedicated to the characterization of protein–protein networks do
not overlap with research devoted to metal–protein interactions,
leaving the area of intersection a terra incognita.
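My interdisciplinary doctoral project had two phases. During the
first phase, I attempted to explore this unknown area by
programmatically surveying the Research Collaboratory for
Structural Bioinformatics Protein Data Bank for metal-mediated
protein–protein interactions, or more precisely, protein–protein
interfaces involving metal ions. The second phase was devoted to
the thorough characterization, by biophysical methods, of one
special case of zinc-mediated protein–protein interactions: the
zinc hook of the DNA repair protein Rad50, which is crucial for
the physiological dimerization of Rad50 and, consequently,
necessary for the proper functioning of the key player in the
deoxyribonucleic acid damage response, the
Mre11–Rad50–Nbs1(Xrs2) (MRN(X)) complex.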
The logic of the survey was implemented in the Python programming
language. One of the two surveys generated in this process
presents the physiological zinc-mediated protein–protein
interactions (assessed manually). The second survey is updated
regularly, as it is deployed as an online tool, InterMetalDB,
built with the Django framework.
The second phase of my doctoral project, the biophysical
characterization of the zinc hook from Rad50, had two parts. The
first part was to describe the sequence–structure–stability
relationship of human Rad50. This is the first study to explore
the stability of a eukaryotic Rad50; moreover, to my knowledge,
the investigated complex is the most stable Zn(II) complex
described in the human proteome so far. Human Rad50 is
phosphorylated in the close vicinity (structurally and
sequentially) of the zinc-binding motif, at threonine 690. We
have determined that, due to the extreme stability of the
complex, phosphorylation of the zinc hook cannot be a switch
that controls the metallation state of Rad50 under physiological
conditions.
Despite the extreme stability of Zn(Rad50)₂, the complex is not
selective for Zn(II), which can be replaced by metal ions with
higher affinities towards the zinc hook. The second part of the
characterization of the zinc hook was performed using the model
zinc hook domain of Rad50 from Pyrococcus furiosus (P. furiosus)
and was devoted to investigating the structural and stability
properties of the domain mismetallated by the heavy metal ions
Hg(II) and Ag(I). Those metals readily displaced Zn(II) under the
experimental conditions, suggesting that the stabilities of the
resulting complexes were even higher than that of the Zn(II)
complex. I was able to estimate these extremely high affinities
using the least likely method, isothermal titration calorimetry.
The results obtained contribute to the understanding of
metal-mediated protein–protein interactions, including
zinc-mediated ones, of the relationship between the structure and
the stability of human Rad50, and of the impact of heavy metal
ions on the stability and architecture of Rad50 from P. furiosus.
Streszczenie
Prawie każda makrocząsteczka w komórce lub żywym organizmie musi
przejściowo lub trwale oddziaływać z innymi cząsteczkami, aby
spełnić swoją rolę. Przewiduje się, że około 30–40% tych
makrocząsteczek oddziałuje z jonami metali, przy czym często
oddziaływanie to jest niezbędne, aby makrocząsteczka była
biologicznie aktywna. Biorąc pod uwagę te liczby, dziwi fakt, że
jak dotąd badania poświęcone charakterystyce sieci oddziaływań
białko–białko nie pokrywają się z badaniami poświęconymi
interakcjom metal–białko, pozostawiając obszar przecięcia jako
terra incognita. Mój interdyscyplinarny projekt doktorski miał
dwie fazy. W pierwszej fazie podjęłam próbę eksploracji
nieznanego obszaru programowo badając zasoby Research
Collaboratory for Structural Bioinformatics Protein Data Bank,
pod kątem interakcji białko–białko z udziałem metalu—a dokładniej
interfejsów białko–białko z udziałem jonów metali. Drugi etap
poświęcony był dokładnej charakterystyce, metodami biofizycznymi,
jednego ze szczególnych przypadków interakcji białko–białko z
udziałem jonu cynku—haczyka cynkowego białka naprawczego DNA
Rad50, koronnego dla fizjologicznej dimeryzacji Rad50, a w
konsekwencji niezbędnego do prawidłowego funkcjonowania
kluczowego gracza w odpowiedzi na uszkodzenia kwasu
deoksyrybonukleinowego—kompleksu Mre11–Rad50–Nbs1(Xrs2) (MRN(X)).
Logika badania została wdrożona przy użyciu języka programowania
Python. Jeden z dwóch wygenerowanych w tym procesie przeglądów
przedstawia fizjologicznych interakcji białko–białko z udziałem
jonu cynku (ewaluowane ręcznie). Drugi przegląd jest stale i
regularnie aktualizowany, ponieważ jest wdrożony jako narzędzie
online z wykorzystaniem platformy programistycznej
Django—InterMetalDB.
Drugi etap mojego projektu doktorskiego—biofizyczna
charakterystyka haczyka cynkowego z białka Rad50 składał się z
dwóch części. Pierwszą częścią było opisanie relacji
sekwencja–struktura–stabilność ludzkiego białka Rad50. Powstała
publikacja jest pierwszą publikacją, opisującą stabilność
eukariotycznego Rad50, ponadto, według mojej wiedzy, badany
kompleks tworzy najbardziej stabilny kompleks cynkowy opisany w
ludzkim proteomie. Ludzkie białko Rad50 jest fosforylowane w
bliskim sąsiedztwie (strukturalnie i sekwencyjnie) motywu
wiążącego cynk, na treoninie 690. Pokazałem również, że ze
względu na ekstremalną stabilność kompleksu fosforylacja haka
cynkowego nie może być przełącznikiem kontrolującym stan
dimeryzacji białka Rad50 w warunkach fizjologicznych. Pomimo
ekstremalnej stabilności Zn(Rad50)₂, kompleks nie jest selektywny
wobec Zn(II), który może być zastąpiony jonami metali o wyższym
powinowactwie do haka cynkowego.
Druga część charakterystyki haczyka cynkowego została
przeprowadzona z wykorzystaniem modelowej domeny haczyka
cynkowego z Rad50 pochodzącego z Pyrococcus furiosus (P.
furiosus) i była poświęcona badaniu właściwości strukturalnych, a
także stabilności domeny wysyconej przez jony metali
ciężkich—Hg(II) i Ag(I). Metale te w warunkach eksperymentalnych
łatwo i chętnie wypierały Zn(II), co sugeruje, że stabilność
powstałych kompleksów była nawet wyższa niż kompleksu cynkowego.
Te niewyobrażalnie wysokie powinowactwa udało mi się oszacować
przy użyciu najmniej prawdopodobnej metody—izotermicznej
kalorymetrii miareczkowej. Uzyskane wyniki przyczyniają się do
zrozumienia interakcji białko–białko z udziałem metalu, w tym
interakcji białko–białko z udziałem jonu cynku, jak również
zależności pomiędzy strukturą i stabilnością ludzkiego białka
Rad50, a także wpływu jonów metali ciężkich na stabilność i
architekturę białka Rad50 z P. furiosus.
Publications
Some ideas and figures have appeared previously in the following
publications:
Bibliography
To succeed, planning alone is insufficient.
One must improvise as well.
— Isaac Asimov [12]
Acknowledgments
Plato and his student Aristotle thought that man is by nature a
political animal, i.e., man is made to live in a polis and be an
active member of society. A solitary man is not self-sufficient:
he has various needs that cannot be fulfilled alone. An
individual cannot even meet his material needs alone, let alone
intellectual needs such as play, art, music, sports, friendship,
and learning. The same is true for conducting research: this
doctorate would not have been possible without the contribution
(direct or indirect) of others. These acknowledgments are not
sorted by importance; I believe that would be impossible to do
objectively. However, I do hope that everyone important during my
doctoral project is mentioned and recognized.
I would like to thank:
• my supervisor Artur Krężel for his supervision and guidance
  over the past five years,
• Violetta Trzyna for her invaluable help with tedious and
  complicated paperwork,
• Michał Padjasek for sharing various hobbies, for support, and
  for inspiration,
• my parents and family for support,
• Adam Pomorski for insightful chats and brilliant remarks,
• Jakub Sławski, Michał Tracz, and Marek Łuczkowski for shared
  lunches and discussions,
• Mateusz Krzyścik for technical expertise and providing a wrench
  when needed,
• Anna Kocyła, Olga Kerber, Marek Łuczkowski, Aleksandra
  Chorążewska, Alicja Misiaszek, and other labmates for
  successful collaboration,
• the students I have worked with, Aleksandra Gędaj and Ewa
  Olczak, for patience,
• Krystyna Grzesiak for understanding, patience, and support, so
  much needed when conducting research and writing articles,
• Czarek Grzesiak for short and long walks,
• Gabor Markowski, Wojciech Graf, and Krzysztof Pukało for
  showing me that sport can be fun,
• Krzysztof Pukało, Wojciech Graf, Gabor Markowski, Maciej
  Majkowski, and Piotr Bachry for giving me the opportunity to
  stress out by playing Dungeons & Dragons,
• Katsuaki Inoue, Nathan Cowieson, and Diego Gianolio for being
  local contacts at Diamond Light Source,
• Alexandra Elbakyan for her contributions to science.
Introduction
Proteins and cofactors <chap:Proteins-and-cofactors>
A significant portion of the organic compounds synthesized by
living organisms are biopolymers: proteins, lipids, nucleic
acids, and other macromolecules (e.g., lignins, gums, melanins).
Among these macromolecules, proteins fulfill key roles in almost
every biological process (catalysis, transport, mechanical and
structural functions, and others). This extraordinarily rich
diversity of functions is achieved even though proteins are
linear polymers built of only 20 types of canonical amino acids[footnote:
Without taking into account post-translational modifications and
amino acid residues that are inserted co-translationally.
]. Linked sequentially into a linear polymer, those canonical
amino acids give rise to a staggering variety of folds,
structures, and assemblies. This diversity of protein structures
allows proteins to fulfill an astonishing number of functions;
however, not every chemical (or physical) process can be carried
out with only 20 canonical amino acids. One way evolution has
solved the problem of the limited set of chemical groups that can
facilitate important processes is the use of post-translational
modifications and cofactors. Numerous publications state that
roughly one-third of proteins interact with cofactors[footnote:
My duty to the reader is to point out that the statement that
one-third of proteins interact with cofactors is not very
credible. This type of statement often appears in papers
discussing cofactors, yet one looks in vain for a reference to
the literature or for information on which organism these
proteins are from or how the analysis was done.
] (Rosenzweig 2002; Bushmarina et al. 2006; Cao and Li 2011; ?? ??).
A query of the Protein Data Bank for enzyme structures returns
over 30% of enzymes that contain cofactors, which indicates that
cofactors are essential for the functioning of a vast number of
proteins (Mukhopadhyay et al. 2019).
Cofactors, contrary to post-translational modifications, are not
bound covalently to proteins but associate with them through
other interactions, e.g., hydrogen bonds and van der Waals
forces. Thus, to interact with cofactors, proteins had to evolve
sites that attract and associate with them. The residues (not the
sequence!) that adopt a particular conformation and arrangement
in space and are able to accommodate a particular cofactor are
collectively called a binding site. Proteins are able to interact
with organic, non-protein cofactors, called coenzymes; the
inorganic cofactors are mostly metal ions. The residues of a
binding site that interacts with a metal ion are called a
“metal-binding site”. Among the various cofactors, the Zn(II)
cation is one of the most commonly found in proteins. The
proteins interacting with Zn(II) are among the most diverse and
widespread proteins found in nature. Genome sequencing projects
have provided a great deal of information about the primary
structure of proteins. Bioinformatic analysis of this information
in terms of metal interaction suggests that about 10% of the
human proteome may bind Zn(II) ions
(Andreini et al. 2006; Andreini et al. 2006; Andreini, Bertini, and Rosato 2009).
It is worth noting that those predictions (based solely on the
protein sequence) do not take into account the fact that the
metal ion may be bound by two or more macromolecules. This
seemingly minor remark opens up a variety of other, challenging
questions: “How often does this intermolecular metal binding
occur?”, “What are the properties of such sites and of the
binding proteins?”, and many others.
This PhD project was dedicated to the analysis of proteins
binding metal ions in metal-mediated protein–protein
interactions. The part of the project carried out in silico is a
survey of all known proteins that bind metal ions in an
intermolecular fashion (including the zinc ion), while the ex
silico studies concern the interaction of metal ions with Rad50
(see: [ch:Rad50]).
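The idea of an "intermolecular" metal site can be illustrated with a minimal Python sketch. It is not the survey code used in this project; the `Ligand` record, the `intermolecular_sites` function, and the example data are hypothetical and serve only to show the criterion: a metal ion counts as intermolecular when its coordinating residues come from at least two different chains.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Ligand(NamedTuple):
    """One coordinating residue of a metal ion (hypothetical record layout)."""
    metal_id: str   # identifier of the metal ion, e.g. "ZN:1"
    chain_id: str   # polypeptide chain that donates the residue
    residue: str    # residue label, e.g. "CYS1"


def intermolecular_sites(ligands: Iterable[Ligand]) -> list[str]:
    """Return ids of metal ions coordinated by residues from >= 2 chains."""
    chains_per_metal = defaultdict(set)
    for lig in ligands:
        chains_per_metal[lig.metal_id].add(lig.chain_id)
    return sorted(m for m, chains in chains_per_metal.items() if len(chains) >= 2)


# Made-up data: ZN:1 bridges chains A and B, ZN:2 is bound within chain A only.
example = [
    Ligand("ZN:1", "A", "CYS1"), Ligand("ZN:1", "A", "CYS4"),
    Ligand("ZN:1", "B", "CYS1"), Ligand("ZN:1", "B", "CYS4"),
    Ligand("ZN:2", "A", "HIS10"), Ligand("ZN:2", "A", "CYS13"),
]
print(intermolecular_sites(example))  # ['ZN:1']
```

In a real survey the ligand records would have to be derived from structural data, and chains belonging to the same molecule would need separate handling; both steps are outside this sketch.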
1.1 Metal ions
1.1.1 Alkali and alkaline earth metal ions
Approximately 1% of human body weight is made up of alkali and
alkaline earth metal ions (Joseph J. Stephanos 2014). The roles
of alkali and alkaline earth metal ions in organisms are diverse.
Sodium and potassium ions occur in all known organisms and
generally function as electrolytes. Calcium ions function as
messengers, used by living organisms to communicate and
orchestrate intracellular processes. Magnesium ions are the most
abundant divalent cations in the cell; the majority of them are
bound to ATP, which is essential for ATP utilization
(Schwartz et al. 2014).
In proteins, both alkali and alkaline earth metal ions commonly
play a structural role in stabilizing the protein structure;
however, those metals can play a catalytic role as well (e.g.,
Mg(II) in T4 ligase; Cherepanov and de Vries 2002). Alkali and
alkaline earth metal ions compensate the charge of highly acidic
regions in macromolecules, e.g., the polyphosphate backbone of
nucleic acids
(Owczarzy et al. 2004; Zheng et al. 2015; Varnai and Zakrzewska 2004).
Although the biology of alkali and alkaline earth metals is quite
well understood, it still hides some secrets. For instance, the
biological effects (as well as some of the side effects) of
lithium are known, and lithium carbonate is widely used to treat
some mood disorders; however, the biological targets of Li(I)
ions in the organism are not yet known.
The alkali metals have a single outer s electron, and the
alkaline earth metals have a filled outer s orbital (two
electrons). Both groups of metals are highly electropositive and
poorly polarizable (Joseph J. Stephanos 2014). Due to their low
polarizability, alkali and alkaline earth metal ions are
considered hard Lewis acids. In accordance with Ralph Pearson's
concept, hard Lewis acids react more readily with hard Lewis
bases (Pearson 1963). This behavior is easily confirmed by
observing the ligands that complex those metals: the coordination
sphere of those metal ions usually consists of hard Lewis bases
(in proteins, the oxygen atoms of acidic amino acid residues).
Alkali and alkaline earth metal ions are usually coordinated by
macromolecules in a six-coordinate octahedral geometry
(Kuppuraj, Dudev, and Lim 2009); however, more than six donor
atoms are sometimes found in alkali or alkaline earth
metal-binding sites, mostly due to bidentate ligands
(Zheng et al. 2008).
1.1.2 Transition metal ions <subsec:Transition-metal-ions>
Transition metals are defined as elements that have a partially
filled d subshell. Because the d orbitals are quite close in
energy (nearly degenerate), transition metals can form compounds
with a wide range of oxidation states, some of which are quite
stable. The elements of the d block also tend to be more
polarizable. Those two properties make transition metals useful
in catalysis, so it is no wonder that evolution has utilized them
so widely in enzymes. The first property is used when the
oxidation state of the enzyme's substrate needs to be changed. In
such an enzyme, the metal ion can cycle between oxidation states
and act as an acceptor or donor of electrons (depending on the
catalyzed reaction). The second property allows metal ions to
function in catalytic sites without changing the oxidation state.
Such metal cations play the role of a Lewis acid, accepting pairs
of electrons, e.g., Zn(II) in the catalytic site of carbonic
anhydrase (Krishnamurthy et al. 2008).
In the presence of ligands, the d orbitals of transition metal
ions lose their degeneracy; this causes a variety of
physicochemical effects (e.g., color) and also imposes a
particular geometry on the complex. Unsurprisingly, proteins have
evolved to accommodate the various geometries of metal ions: a
metal-binding site has to match the geometry preferred by a
particular metal ion in order to bind it.
Of the ten first-row transition metals, five are essential to
human health: manganese, iron, cobalt, copper, and zinc (the
last not being a transition metal per se, as discussed in
[sec:properties_of_zinc]). The essentiality of those elements is
not in question, as they are, or are part of, the cofactors of
biologically important proteins and enzymes.
Manganese is present as a cofactor in several enzymes that
maintain metabolism; for example, in the human brain,
manganese-dependent glutamine synthetase is responsible for
glutamine synthesis in astrocytes (Takeda 2003).
Iron accounts for 0.05‰ of human body weight. Hemoglobin is the
flagship protein that illustrates the necessity of iron in the
human diet. Iron in proteins is often bound as part of a
cofactor, heme; however, proteins in which iron ions themselves
act as the cofactor exist as well, e.g., ferritin or rubredoxin.
Cobalt is an essential part of vitamin B[subscript:12], which is used as a coenzyme in DNA synthesis and in amino acid
and fatty acid metabolism. Similarly to iron, cobalt does not
have to be bound only to corrin derivatives such as vitamin B[subscript:12], but can be utilized by some proteins directly as a cofactor
(e.g., methionine aminopeptidase 2).
Copper plays essential roles in electron transport and oxygen
metabolism. It is essential for aerobic respiration in
eukaryotes, as it is found as a cofactor of cytochrome c oxidase,
where it serves as an electron transporter. Copper is also a
cofactor of copper-zinc superoxide dismutase, an enzyme that
catalyzes the disproportionation of superoxide.
[float Table:
[Table 1.1:
Simplified version of the periodic table of the elements
indicating biologically relevant metals. Metals essential for
life are marked in green. Yellow marks the elements that are
essential for some species. Chromium is marked in orange, as its
essentiality for life is questionable.
]
]
In addition to the five essential first-row d-block elements,
three more first-row d-block elements (chromium, vanadium, and
nickel) show some biological effects; however, the essentiality
of those elements and the need for their supplementation are
debatable.
Cr(III) is found in foods and dietary supplements, many of which
are advertised as having a beneficial effect on glucose
metabolism and regulation. This rationale is based on
observations of patients receiving a Cr(III)-free intravenous
drip who developed symptoms similar to insulin-resistant
diabetes. The symptoms were relieved after the intravenous drip
was supplemented with a small amount of Cr(III)
(Stehle, Stoffel-Wagner, and Kuhn 2016). The evidence for the
efficacy of Cr(III) supplementation is of low strength; thus, the
rationale for recommending its use for glycemic control in
patients with type 2 diabetes mellitus is precarious
(Costello, Dwyer, and Bailey 2016). In the United States of
America, Cr(III) is considered an essential nutrient for humans
(?? 2022); the European Food Safety Authority does not share this
opinion, arguing that the evidence supporting this assumption is
insufficient (Agostoni et al. 2014). While the role of Cr(III) in
the human body is uncertain, the toxicity and carcinogenic
properties of Cr(VI) have long been known
(Långard and Norseth 1975).
Vanadium seems to be more important for life in marine
environments: a number of marine life forms, such as algae,
utilize vanadium in enzymes (Butler and Carter-Franklin 2004).
Tunicates draw attention because of a specialized blood cell
type, vanadocytes, which accumulate vanadium through
vanadium-interacting proteins called vanabins; however, the role
of those proteins and of vanadium in tunicates is still shrouded
in mystery (Ueki et al. 2003). The vanadium-related issues for
human health are similar to those of chromium: vanadium complexes
may exert some antidiabetic effects (Sakurai et al. 2002);
however, a functional role for vanadium in mammals, including
humans, has not yet been defined, so it is not recognized as an
essential nutrient (Institute of Medicine 2001).
Nickel is essential to some bacteria, archaea, fungi, and plants,
where it usually plays a role in enzymatic processes; e.g.,
urease, a nickel enzyme, is considered a virulence factor in some
pathogens. The essentiality of Ni(II) in mammals is still
debated. On the one hand, rats raised without nickel (or with a
low nickel supply) exhibit depressed growth and several abnormal
biochemical and morphological results. Nickel also seems to be
involved in reproduction, as nickel deprivation in rats resulted
in an increased perinatal mortality rate
(Zambelli and Ciurli 2013). On the other hand, no physiological
nickel targets have been found in mammals, including humans; the
widespread use of Ni(II) by prokaryotes suggests that the element
might instead be essential for the symbiotic gut microbiota
(Zambelli and Ciurli 2013).
Zinc proteome <chap:Zinc proteome>
2.1 Properties of zinc <sec:properties_of_zinc>
As mentioned in the first sentence of [subsec:Transition-metal-ions],
transition metals have a partially filled d subshell; following
this definition, one cannot classify zinc as a transition metal
([fig:Bohr-model-diagram-zinc]). Other definitions of transition
metals (e.g., as the chemical elements in the d-block of the
periodic table) may include zinc. The goal of this subsection is
not to discuss whether zinc is a transition metal: “transition
metals” is a human concept, and people are in the habit of
categorizing and dividing; the chemical and physical properties
of zinc do not depend on whether one classifies it as a
transition metal or not. The unique properties of zinc, and the
reason why chemists debate how to categorize it, stem from
zinc's electron configuration ([fig:Bohr-model-diagram-zinc]):
the fully filled d orbitals of zinc are energetically stable,
and zinc usually loses only its 4s electrons, leading to the
formation of stable Zn(II), which distinguishes it from the
biologically active transition metals. Because zinc occurs in
only one oxidation state (2+), it cannot partake directly in
redox reactions. The complete filling of zinc's d orbitals also
makes zinc a 'spectroscopically quiet' metal: zinc complexes
have no color. In contrast to metals with partially filled d
shells, zinc has almost no spectroscopic signature
(Penner-Hahn 2005). The redox inertness of Zn(II) might falsely
suggest that evolution has made only limited use of it in
proteins.
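The electron-configuration argument can be written out explicitly. The configurations below are the standard textbook ones; iron is added only as a comparison of a metal with a partially filled d subshell and more than one accessible oxidation state.

```latex
\begin{aligned}
\mathrm{Zn}\;\bigl([\mathrm{Ar}]\,3d^{10}4s^{2}\bigr)
  &\longrightarrow \mathrm{Zn^{2+}}\;\bigl([\mathrm{Ar}]\,3d^{10}\bigr) + 2e^{-} \\
\mathrm{Fe}\;\bigl([\mathrm{Ar}]\,3d^{6}4s^{2}\bigr)
  &\longrightarrow \mathrm{Fe^{2+}}\;\bigl([\mathrm{Ar}]\,3d^{6}\bigr) + 2e^{-}
  \longrightarrow \mathrm{Fe^{3+}}\;\bigl([\mathrm{Ar}]\,3d^{5}\bigr) + 3e^{-}
\end{aligned}
```

Zn(II) retains a closed 3d¹⁰ subshell, which is consistent with its single biologically relevant oxidation state and with the absence of d–d transitions that would give its complexes color.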
[float Figure:
[Figure 2.1:
<fig:Bohr-model-diagram-zinc>Bohr model diagram of zinc.
]
]
After iron, zinc is the second most abundant d-block element
found in proteins
(Andreini et al. 2006; Andreini et al. 2006; Andreini, Bertini, and Rosato 2009);
moreover, it is the only d-block element found in all classes of
enzymes (Vallee and Auld 1990), which is possible because Zn(II)
acts as a Lewis acid.
In solution, Zn(II) ions exist in equilibrium between the
octahedral hexaaqua complex ([Zn(H₂O)₆][superscript:2+]) and the tetrahedral [Zn(H₂O)₄][superscript:2+], with the equilibrium shifted towards the former (Krężel and Maret 2016).
Due to the complete filling of the d orbitals, the transition
from octahedral to tetrahedral coordination (typical for
complexation by proteins) does not entail an energetic penalty in
the case of zinc (Lachenmann et al. 2004); additionally, the
release of solvent from the aqua complex increases the degrees of
freedom in the system, which is energetically favorable due to
the increase of entropy (Krężel and Maret 2016). Proteins that
utilize this unique cofactor are unique as well, and bind this
metal ion in an unusual fashion.
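The absence of an energetic penalty for the octahedral-to-tetrahedral transition can be illustrated with the simple crystal field model, in which the ligand field stabilization energy (LFSE) of a d¹⁰ ion is zero in both geometries. The sketch below is only an illustration under stated assumptions: high-spin filling, no pairing energy, and a hypothetical helper named `lfse`. Ni(II) and Fe(II) are included purely for comparison.

```python
def lfse(n_d: int, geometry: str) -> float:
    """Ligand field stabilization energy of a high-spin d^n ion.

    Returned in units of the splitting parameter of the chosen geometry
    (Delta_o for octahedral, Delta_t for tetrahedral). Pairing energy is
    ignored, so this is a textbook-level estimate only.
    """
    if not 0 <= n_d <= 10:
        raise ValueError("n_d must be between 0 and 10")
    # (number of orbitals, energy per electron in tenths of Delta)
    levels = {
        "octahedral": ((3, -4), (2, 6)),   # t2g below, eg above
        "tetrahedral": ((2, -6), (3, 4)),  # e below, t2 above
    }
    if geometry not in levels:
        raise ValueError("geometry must be 'octahedral' or 'tetrahedral'")
    (n_low, e_low), (n_high, e_high) = levels[geometry]
    remaining, tenths = n_d, 0
    # First pass: one electron per orbital; second pass: pairing.
    for _ in range(2):
        in_low = min(remaining, n_low)
        remaining -= in_low
        in_high = min(remaining, n_high)
        remaining -= in_high
        tenths += in_low * e_low + in_high * e_high
    return tenths / 10


for ion, n in [("Zn(II)", 10), ("Ni(II)", 8), ("Fe(II)", 6)]:
    print(ion, lfse(n, "octahedral"), lfse(n, "tetrahedral"))
```

For Zn(II) both values are zero, so within this model no ligand field stabilization is lost when the geometry changes, whereas an ion such as Ni(II) has a nonzero LFSE that differs between the two geometries.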
2.2 Zinc-binding sites <sec:Zinc-Binding-Sites>
Zinc(II)-binding proteins are extremely diverse in structure and
function. Zn(II) is an intermediate Lewis acid in terms of
hardness according to the hard and soft acids and bases (HSAB)
theory formulated by Ralph Pearson (Pearson 1963). Because of
this, Zn(II) can be coordinated by the sulfur atom of cysteine (a
soft Lewis base), the nitrogen atom of histidine (an intermediate
Lewis base), as well as by the carboxylate anions of aspartate
and glutamate (hard Lewis bases). The most common ligand found in
zinc-binding sites is cysteine, followed by histidine and the
acidic residues, i.e., aspartic acid and glutamic acid[footnote:
Those findings are based on the structural data from the Protein
Data Bank. The fact that structures of such sites exist does not
necessarily mean that the same holds for all existing
zinc-binding sites.
] (Laitaoja, Valjakka, and Jänis 2013). For zinc-binding sites,
coordination numbers (CN) ranging from two to seven are found in
the literature
(Sousa et al. 2009; Andreini, Bertini, and Cavallaro 2011);
however, di- and tricoordinate zinc complexes are highly
reactive, meaning that stable zinc-binding sites have a CN of at
least four. A thorough manual analysis performed by Laitaoja,
Valjakka, and Jänis (2013) ruled out the physiological existence
of zinc-binding sites with a CN higher than seven. The most
common zinc-binding sites found in the Protein Data Bank have
CN = 4, which corresponds to tetrahedral geometry. The low
occurrence of zinc-binding sites with CN > 4 is probably due to
the small size of Zn(II) (~74 pm and ~88 pm for CN = 4 and
CN = 6, respectively), which causes repulsion between the ligands
(Miessler, Fischer, and Tarr 2013). A CN of two in zinc-binding
sites usually results from an incompletely resolved structure of
a zinc-binding site with a higher CN (e.g., the structure of the
zinc hook domain, [https://www.rcsb.org/structure/6ZFF||6ZFF];
Soh, Basquin, and Gruber 2021); likewise, CN = 3 in most cases
reflects an unresolved zinc-binding site with CN = 4, which is
very common in enzymatic zinc-binding sites, where the missing,
unresolved ligand is often a water molecule. Differences in the
number and types of ligands often translate into differences in
the properties of the coordination sphere and of the zinc-binding
site formed.
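The rules of thumb from this paragraph can be collected in a short Python sketch. It is only an illustration: the `DONORS` table and the `describe_site` function are hypothetical, each ligand is assumed to contribute exactly one donor atom (bidentate binding is ignored), and only the two geometries named in the text are assigned. Treating water as a hard Lewis base is standard HSAB classification rather than a statement from the text.

```python
# Donor atom and HSAB class per ligand, following the paragraph above.
DONORS = {
    "CYS": ("S", "soft"),
    "HIS": ("N", "intermediate"),
    "ASP": ("O", "hard"),
    "GLU": ("O", "hard"),
    "HOH": ("O", "hard"),  # water, often the fourth ligand in enzymes
}


def describe_site(ligands):
    """Summarize a zinc site from ligand names, one donor atom per ligand."""
    unknown = [name for name in ligands if name not in DONORS]
    if unknown:
        raise ValueError(f"unknown ligand(s): {unknown}")
    cn = len(ligands)  # coordination number under the one-donor assumption
    if cn == 4:
        geometry = "tetrahedral"
    elif cn == 6:
        geometry = "octahedral"
    else:
        geometry = "not assigned in this sketch"
    return {
        "cn": cn,
        "geometry": geometry,
        "donor_classes": sorted({DONORS[name][1] for name in ligands}),
        # CN < 4 usually points to an incompletely resolved site
        "possibly_incomplete": cn < 4,
    }


print(describe_site(["CYS", "CYS", "CYS", "CYS"]))
print(describe_site(["HIS", "HIS", "HIS"]))
```

The second call mimics an enzymatic site deposited without its water ligand: the sketch reports CN = 3 and flags the site as possibly incomplete rather than treating it as a genuine tricoordinate complex.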
The complexation of Zn(II) by a polypeptide often entails the
formation of a stable conformation. Zinc-binding sites can exist
as a preformed arrangement of amino acid residues in space, where
zinc binding does not cause large structural changes in the
protein (usually in enzymes); however, zinc binding can also
promote protein folding and the formation of separate protein
domains (e.g., zinc fingers), which is often the case in
structural zinc sites (Kochańczyk, Drozd, and Krężel 2015).
Zinc-binding sites can be divided by function but also by their
architecture. Functionally, zinc sites can be divided into:
catalytic Zn(II) can be found in the active center of enzymes of
any of the six classes distinguished by the International Union
of Biochemistry and Molecular Biology. In almost every enzyme,
Zn(II) acts as a Lewis acid and one of the zinc ligands is a
water molecule. In other proteins with no catalytic function,
Zn(II) is usually bound only by atoms of the side chains of amino
acid residues. The interaction of the water molecule with the
zinc cation allows for the transient replacement of the water
molecule by the substrate molecule. However, in the case of the
most studied carbonic anhydrase, the water molecule is not
displaced from the zinc-binding site; instead, Zn(II) lowers the
pKa of water, facilitating its dissociation to a hydroxide ion,
which reacts with CO₂.
structural zinc-binding sites are usually characterized by a high
affinity towards Zn(II). Classical examples of domains
containing structural zinc-binding sites are zinc fingers. An
important characteristic that accompanies structural binding
sites is the presence of a hydrophobic core that stabilizes the
zinc-protein complex Padjasek et al. 2020. The function of the
zinc-binding site in such domains is to stabilize the – a good
example of this may be a peptide, an example of de novo protein
design that bases on the of zinc finger Dahiyat, Sarisky, and Mayo 1997
. The de novo peptide maintains the same as a fragment of ,
however, the fragment compared with shares only four amino acids
in the same positions Dahiyat, Sarisky, and Mayo 1997. Most
notable is the enrichment of the sequence in hydrophobic and
aromatic residues, which energetically stabilizes the fold
through the hydrophobic effect. Perhaps the greatest importance
of structural zinc-binding sites can be seen in the case of chimeric
, where the zinc hook from can efficiently replace the hinge Tatebe et al. 2020
while taking much less volume.
regulatory zinc-binding sites usually have medium affinities
towards Zn(II). This property allows them to associate with
Zn(II) during Zn(II) influx and dissociate during Zn(II)
efflux. Zinc is an essential metal for life; however, in
excess it exerts a toxic effect on the cell, so its
concentration in the cell needs to be regulated, which the cell
achieves by various mechanisms. The
functional division of the zinc-binding sites proposed here
distinguishes three classes that are involved in regulation,
which shows the importance of maintaining physiological in the
cell. An example of the regulatory zinc-binding site can be the
from . Zinc-binding by is characterized by remarkable lability
observed both in vitro and in vivo Qiao et al. 2006. This
lability is essential for fulfilling its regulatory function.
transporting zinc-binding sites are responsible for cellular
influx or efflux of Zn(II). Similarly to regulatory and buffering
binding sites, the transporting zinc-binding sites are
characterized by a medium affinity towards Zn(II). The examples
of the transporting zinc-binding sites are , , and , which are
regulated by . The driving force for Zn(II) transport (which
requires a change in both conformation and affinity towards
Zn(II)) may be the proton motive force, as is the case for human
proton-coupled zinc antiporters (ZnT family). However, in the
case of the larger Zn(II) transporter family (ZIP), the driving
force is not yet known Coudray et al. 2013; Bafaro et al. 2017.
buffering zinc-binding sites are found in proteins that bind
Zn(II) in order to maintain physiological in the cell. Those
sites are characterized by a medium, but a broad range of
affinities, which allows for maintaining adequate . A flagship
example of buffering binding sites can be found in the
metallothioneins, which may bind up to seven Zn(II) ions with a
broad range of affinities Krężel and Maret 2007.
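The link between affinity and reversible binding described above
for the regulatory, transporting, and buffering sites can be
illustrated with the simplest 1:1 binding model, in which the
fraction of occupied sites equals [Zn] / ([Zn] + Kd). The sketch
below assumes this model only; the dissociation constants and free
Zn(II) concentrations are hypothetical values chosen for
illustration and are not taken from this thesis.

```python
def fractional_occupancy(free_zn_molar: float, kd_molar: float) -> float:
    """Fraction of sites occupied by Zn(II) in a simple 1:1 binding model."""
    if free_zn_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_zn_molar / (free_zn_molar + kd_molar)


# Hypothetical numbers, for illustration only.
KD_MEDIUM = 1e-9    # assumed "medium affinity" site
KD_HIGH = 1e-13     # assumed "high affinity" (e.g. structural) site
LOW_ZN = 1e-10      # assumed free Zn(II) concentration after efflux
HIGH_ZN = 1e-8      # assumed free Zn(II) concentration after influx

if __name__ == "__main__":
    for label, kd in (("medium", KD_MEDIUM), ("high", KD_HIGH)):
        low = fractional_occupancy(LOW_ZN, kd)
        high = fractional_occupancy(HIGH_ZN, kd)
        print(f"{label:>6} affinity: {low:.3f} -> {high:.3f}")
```

Under these assumed numbers the medium-affinity site switches
between mostly empty and mostly occupied as the free Zn(II)
concentration changes, whereas the high-affinity site stays
saturated in both conditions.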
The number of zinc-binding sites classes related to the
regulation of the in the cell together with the occurrence of
zinc ions in cellular vesicles could potentially be explained by
the fact that no Zn(II)-storage proteins (akin to the iron
storage protein ferritin) have been found. Ferritin, for example,
may store several thousand iron ions trapped in its core Maret 2017.
In terms of architecture, the zinc-binding sites can be divided
into five classes:
intraprotein zinc-binding is the most common type of zinc
coordination in proteins. Intraprotein binding is understood
as the formation of a zinc-binding site by a single polypeptide
chain only.
clustered zinc-binding sites are multinuclear, i.e., they contain
more than one Zn(II) per binding site. This is achieved by the
presence of bridging ligands (usually sulfur atoms from
cysteines). These sites are thermodynamically quite stable while
at the same time kinetically labile. It is therefore no surprise
that metallothioneins (mentioned under the buffering zinc-binding
sites) utilize this type of binding. This type of binding is seen in
most zinc enzymes, , etc.
interprotein zinc-binding sites are less common (or at least
less commonly discovered and described) than intraprotein
zinc-binding sites. Interprotein zinc-binding sites are formed by
at least two polypeptide chains, so in this architecture Zn(II),
bound at the interface of two proteins, takes a structural part
in the formation of the quaternary protein structure. The formation of
interprotein zinc-binding sites, referred to in the thesis as ,
may be obligatory to form a quaternary structure, or Zn(II) may
bind to already formed protein-protein complex, to further
stabilize it Kochańczyk, Drozd, and Krężel 2015. To date only a
few have been studied in terms of the complexes' stabilities,
this includes - Kocyła and Krężel 2018 and - Davis and Berg 2009
complexes, and various orthologs studied by me and my colleagues
Padjasek et al. 2020; Tran, Padjasek, and Krężel 2022.
In terms of the time span of binding, one can introduce two
additional classifiers:
transient zinc-binding sites are often characterized by medium
affinities towards Zn(II). The role of such transient zinc-binding
sites is to bind Zn(II) temporarily, which is the case in the
buffering, transporting, and regulatory zinc-binding sites. Those
sites need to bind Zn(II) in a kinetically labile way in order to
fulfill their functions.
permanent zinc-binding sites are kinetically inert. This type of
binding is characteristic of structural and enzymatic binding
sites. The Zn(II) bound to those sites is not easily dissociated
from the site. Of course, the reaction is in equilibrium so the
subunit exchange still happens (see equations and [eq:Ka_definition]
), however, the binding is characterized by a low value of .
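The architectural classes described above (intraprotein,
clustered, interprotein) can be expressed as a small decision rule
over the donor atoms of a site. The following is only a minimal
sketch: the Donor record, the classify_site function, and the
priority given to "clustered" over the other two labels are
assumptions made for illustration and do not reproduce any tool
used in this thesis.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Donor:
    """One donor atom of a zinc site (hypothetical record layout)."""
    chain_id: str              # polypeptide chain the donor belongs to
    residue_name: str          # e.g. "CYS", "HIS", or "HOH" for water
    zinc_ids: FrozenSet[str]   # Zn(II) ions coordinated by this donor


def classify_site(donors: List[Donor]) -> str:
    """Assign an architectural class to a zinc-binding site."""
    # Water molecules do not decide which chains build the site.
    protein_donors = [d for d in donors if d.residue_name != "HOH"]
    if not protein_donors:
        raise ValueError("site has no protein-derived donor atoms")

    zinc_ions = set().union(*(d.zinc_ids for d in protein_donors))
    has_bridge = any(len(d.zinc_ids) > 1 for d in protein_donors)
    if len(zinc_ions) > 1 and has_bridge:
        return "clustered"          # several Zn(II) joined by bridging donors

    chains = {d.chain_id for d in protein_donors}
    return "interprotein" if len(chains) >= 2 else "intraprotein"


if __name__ == "__main__":
    zn = frozenset({"ZN1"})
    hook_like = [Donor("A", "CYS", zn), Donor("A", "CYS", zn),
                 Donor("B", "CYS", zn), Donor("B", "CYS", zn)]
    print(classify_site(hook_like))
```

The example site, with two cysteines from each of two chains, is
labeled interprotein; the same four donors placed on a single chain
would be labeled intraprotein.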
Investigation of intermolecular metal-binding sites
Zn(II) ions were found to be essential for the growth of the
toxic mold Aspergillus niger in 1869. Nearly one hundred years
later, in 1961, it was postulated that zinc is also essential for
humans. Almost 20 years later the first structures of proteins
containing Zn(II) started to appear. It is no exaggeration to say
that the number of scientific questions is proportional to the
amount of new data: with the advent of the first
Zn(II)-containing macromolecular structures, questions about the
factors that govern the formation of zinc-binding sites started
to appear. With more Zn(II)-containing structures deposited in
the PDB, some tendencies became clear (see [sec:Zinc-Binding-Sites]
), however, the fact that Zn(II) ions (and other metal ions)
can be bound at the interface of two or more macromolecules was
largely ignored – one of the first reviews of structural knowledge of
metal-binding sites in proteins appeared in 1992[a], Tainer, Roberts, and Getzoff 1992
whereas the first review regarding the binding of metals at
appeared in 2014[a]. Song et al. 2014 This Chapter will focus on
the methodology of how one can investigate the metal-binding
sites at .
3.1 In silico investigation of intermolecular metal-binding sites
3.1.1 The problems of in silico investigation of intermolecular
metal-binding sites
To date, there are no computational algorithms or other methods
able to identify or classify intermolecular metal-binding sites
based on protein sequence or structure. The prediction of
metal-binding sites has been attempted several times in the
past. Apart from the availability of such tools, which
is often unacceptable Ye et al. 2022, the quality of the proposed
classifiers is often poor. [margin:
Classifier is a tool that assigns an element to a certain class.
An algorithm that tells whether a protein is a based on the
protein attributes is a classifier.
] Usually, the authors of such algorithms present the quality and
efficiency of their classifier in a reliable way, e.g., by
showing various statistics such as the receiver operating
characteristic curve in the case of binary classifiers. At first
glance the presented tools seem to work very well; however, to
date, their use remains limited. One certainly cannot accuse the
authors of such articles of scientific misconduct or bad faith;
usually, the problem lies deeper. There are two main problems
with the classifiers: the first is the initial data set, which is
ultimately the source of the training and test data sets, and the
second is the choice of explanatory variables.
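For reference, the receiver operating characteristic statistic
mentioned above can be computed from true labels and classifier
scores as in the sketch below. It is a generic illustration with
made-up labels and scores, not the evaluation code of any of the
cited tools; the function names roc_points and auc are chosen
here. Note that a high area under the curve only reflects
performance on the data set used, which is exactly where the first
of the two problems enters.

```python
from typing import List, Sequence, Tuple


def roc_points(labels: Sequence[int],
               scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs.

    labels: 1 for a true metal-binding site, 0 otherwise.
    scores: classifier output, higher means "more likely a site".
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative")

    points = [(0.0, 0.0)]
    # Lower the decision threshold step by step over the observed scores.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    # Made-up example data.
    example_labels = [1, 0, 1, 0]
    example_scores = [0.9, 0.8, 0.7, 0.6]
    print(auc(roc_points(example_labels, example_scores)))
```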
The availability of well-annotated (manually curated)
protein sequence data is quite good. The primary source of such
information is scientific articles. A publication that addresses
whether a protein binds a metal typically applies a number of
methods that demonstrate and describe this binding. Such a
publication does not need to include information about the
structure of the protein, since this information is not necessary
to prove the interaction of the metal with the protein.
Typically, this type of data can be
aggregated in , where metal-interacting residues are often
assigned based on similarity and sequence. The problem with this
type of data is the sparsity of annotations – not every known is
annotated in to interact with metal. This leads to the attempts
to create one's own data set, which in turn involves the use of
the other type of input: the structural data deposited in the
PDB. However, the metal-binding sites in the PDB are not
annotated as true physiological or adventitious metal-binding
sites. This, in turn, leads to the use of simple algorithms that
sort “true” from “false” metal-binding sites in the data set
based on simple rules, e.g., the distance between the metal and
its ligands, the number of ligands, the type of ligands, etc.
This approach, however, is far from accurate.
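A rule-based filter of the kind described above might look like
the following sketch. The distance cutoff, the minimum number of
protein donors, the set of donor elements, and the atom record
layout are all assumptions chosen for illustration; they are not
the rules of any particular published tool.

```python
import math
from typing import Dict, List, Sequence

# Assumed rule parameters (hypothetical, for illustration only).
DONOR_ELEMENTS = {"N", "O", "S"}
DISTANCE_CUTOFF = 2.8        # metal-donor distance in angstroms
MIN_PROTEIN_DONORS = 3


def coordinating_atoms(metal_xyz: Sequence[float],
                       atoms: List[Dict],
                       cutoff: float = DISTANCE_CUTOFF) -> List[Dict]:
    """Return donor atoms lying within the cutoff of the metal ion."""
    return [a for a in atoms
            if a["element"] in DONOR_ELEMENTS
            and math.dist(metal_xyz, a["xyz"]) <= cutoff]


def passes_simple_rules(metal_xyz: Sequence[float],
                        atoms: List[Dict],
                        min_protein_donors: int = MIN_PROTEIN_DONORS,
                        cutoff: float = DISTANCE_CUTOFF) -> bool:
    """Label a site as "true" if enough non-water donors bind the metal."""
    donors = coordinating_atoms(metal_xyz, atoms, cutoff)
    protein_donors = [a for a in donors if a["residue"] != "HOH"]
    return len(protein_donors) >= min_protein_donors


if __name__ == "__main__":
    zinc = (0.0, 0.0, 0.0)
    # Made-up coordinates: four cysteine sulfurs about 2.3 angstroms away.
    cys4 = [{"element": "S", "residue": "CYS", "xyz": xyz}
            for xyz in ((2.3, 0, 0), (-2.3, 0, 0), (0, 2.3, 0), (0, 0, 2.3))]
    print(passes_simple_rules(zinc, cys4))
```

Such a filter would reject, for example, a genuine site whose
ligands are only partly resolved in the structure, which is one
reason why fixed thresholds of this kind misclassify sites.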
The problem with the selection of explanatory variables is most
acute for classifiers that are based on sequence. Whereas
structure-based classifiers can draw on an abundance of
meaningful variables (e.g., hydrophobicity, charge of the amino
acids, geometry, position in space, etc.), predicting a
metal-binding site based solely on the sequence might be similar
to predicting sales on a stock exchange using only the names of
the shares. Nevertheless, in some specific cases, where there is
an existing pattern in the sequence, it is possible to build
sensitive and specific classifiers, e.g., the prediction of zinc
fingers based on sequence is possible due to the existence of
well-defined input data and of patterns in the sequences Sathyaseelan, Patro, and Rathinavelan 2023
.
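To illustrate what such a sequence pattern can look like, the
sketch below searches for a commonly quoted simplified consensus
of the classical C2H2 zinc finger, C-x(2,4)-C-x(12)-H-x(3,5)-H,
using a regular expression. This is an assumption made for
illustration and not the method of the work cited above; the
function name and the test sequence are made up.

```python
import re
from typing import List, Tuple

# Simplified C2H2 consensus: Cys, 2-4 residues, Cys, 12 residues,
# His, 3-5 residues, His (assumed pattern, for illustration only).
C2H2_PATTERN = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_c2h2_candidates(sequence: str) -> List[Tuple[int, str]]:
    """Return (start index, matched segment) for each non-overlapping hit."""
    return [(m.start(), m.group(0))
            for m in C2H2_PATTERN.finditer(sequence.upper())]


if __name__ == "__main__":
    # Made-up sequence containing one pattern instance.
    example = "AAACPVESCDRRFSRSDELTRHIRIHAAA"
    for start, segment in find_c2h2_candidates(example):
        print(start, segment)
```

A match only indicates that the spacing of cysteines and
histidines is compatible with the motif; it says nothing about
whether the segment actually folds around Zn(II).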
The above problems with the prediction of metal-binding sites
apply equally to intermolecular metal-binding sites. Moreover,
the involvement of two or more interacting macromolecules
increases the complexity of the problem. So how to search for
metals bound at if there are no viable in silico tools? The
question is addressed below in [subsec:Obtaining-knowledge-about-InterMBS]
.
3.1.2 Obtaining knowledge about intermolecular metal-binding
sites<subsec:Obtaining-knowledge-about-InterMBS>
As in the case of information about the interaction of the metal
with the protein, publications may also be the primary
source of information about metal-binding on . However, the lack
of recognition of inter-protein metal ion binding has partly
contributed to the fact that even if a publication is accompanied
by the deposition of a structure in the , the fact of
inter-protein metal ion binding is not necessarily commented on
in the publication in any way. An additional problem with
extracting knowledge directly from publications is that the
search for this type of information can be time-consuming.
Articles may not be available (not every article is published as
open access), and there is no tool that aggregates articles by
whether the protein described in the article has an inter-protein
metal-binding site. Together, this makes the approach
impractical. Nevertheless, it is not out of the
question that this type of information aggregation will change
(not necessarily regarding the ) with the increasing number of
open access articles published each year and the development of
technologies related to natural language processing.
To date, the best sources of information on proteins that bind
metal ions in an intermolecular manner are structural databases
such as the PDB. Since the PDB contains all deposited structures (whether they