C Fundamentals
Let's start by writing a simple program that prints: "Hola, mundo!"
But first, a little background.
- C code is composed of ‘statements’
- Each statement ends with a semicolon
- Every program has an ‘exit code’
- It indicates whether the program executed successfully or not
- The convention for ‘success’ is `0`
- C is a compiled language
- Unlike languages such as Python or R, where you can run the program directly from the text script, C code has to be compiled to an executable: a set of instructions that the machine can interpret and operate on.
The compilation is carried out by our installed `gcc` compiler.
Let's begin. Initial script:
#include <stdio.h>
// main returns an int, which becomes the program's exit code
int main()
{
return -1;
}
Compiling the program
gcc -o hola hola.c
Note: due to a UNIX security measure, you might have to make the program executable by modifying its permissions.
For macOS:
chmod +x program_name
For Linux:
chmod 775 program_name
Execute the program
./hola
Initially, there is no printed output.
However, we can inspect the program's exit code.
Print the exit code
echo $?
The shell reports the exit code as a single byte (8 bits), so it can only hold values from 0 to 255. Our return value of -1 wraps around and is reported as 255.
Solution:
Use the `stdout` stream.
Here is a ‘hello world’ program:
💡 `fprintf` prints formatted text to whichever file stream you pass it; passing `stdout` sends the text to the console.
#include <stdio.h>
int main()
{
fprintf(stdout, "hola, mundo\n");
return 0; // 0 signals success
}
If you don’t include the `\n` newline, the following ‘shell prompt’ will show up on the same line.
In UNIX, every program has two outputs, `stderr` and `stdout`, and we can direct our output to either one.
#include <stdio.h>
int main()
{
fprintf(stdout, "hola, mundo out\n");
fprintf(stderr, "hola, mundo err\n");
return 0; // 0 signals success
}
We can redirect all output from `stdout` to a text file, using the `>` operator
./hola > out
We can easily view the contents of any text file using the universal UNIX `cat` tool (i.e. run `cat out`).
To capture the `stderr` output (typically used for error messages) in a separate file, use the `2>` operator:
./hola > out 2> err
cat out
cat err
Note that in C, while there are ‘conventions’ for indentation for code readability, indentation has no effect on how the code compiles. The exception is pre-processor directives (such as `#include`), which must each sit on their own line.
- C is a ‘weakly typed’ language: every variable has a fixed, declared type, but the compiler will readily convert values between types (for example, float to int) without stopping you.
- The `$?` variable is where the shell stores the exit code of the last program it ran
- When declaring a variable, you have to give it a type
First type:
#include <stdio.h>
int main()
{
int a = 2;
int b = 4;
int c = a + b;
return c;
}
Compile with warnings enabled:
gcc -Wall -pedantic -o variables variables.c
For example, if you try to use a float in `variables.c`, the compiler may generate warnings, depending on the compiler and the flags used (for gcc, `-Wconversion` asks for these).
The compiler implicitly converts the float to an int, giving it a truncated value.
Manipulating bytes
2’s complement
An integer takes up 4 bytes, i.e. 32 bits, which gives 2^32 possible values
But this capacity is split in half: one half covers the negative domain, and the other half covers zero and the positive domain
You can shift a number by one binary position with the `<<` (bit shift) operation.
#include <stdio.h>
int main(void)
{
unsigned int a = 1;
fprintf(stderr, "%u\n", a<<1); // %u matches unsigned int
return 0;
}
Every left shift by one position doubles the number (relative to the previous iteration), as long as no set bit is pushed off the top end.
Characters (char)
#include <stdio.h>
int main()
{
unsigned char mychar = 5;
unsigned long l = 8000000;
return 0;
}
Unless you redirect them, both standard error and standard out are printed to the terminal.
As seen in the previous section, in C you can take nothing for granted. Naturally, this logic applies to using memory.
Memory can be represented as a ‘list’ of chunks, where each chunk represents a byte (the minimum amount of addressable memory).
- 255 is the maximum number a single byte can hold, because each byte has 8 bits (2^8 = 256 values, 0 to 255)
- And this isn’t considering signed integers
The bytes are indexable from `0...n`
Each ‘chunk’ of memory has a discrete number (its index) and holds an arbitrary value. Chunks that follow the first one are located relative to its address. As long as the ‘index’ lies within the total amount of addressable bytes in our memory, any ‘number’ is theoretically valid.
In C there is no such thing as a ‘list’. There are simply ‘arrays’: that is, items, each with an index. One way of creating a ‘list’ is to request a block of memory and address each item by a discrete ‘index’.
We use the asterisk `*` to declare a ‘pointer’
- `list` doesn’t hold the data itself, but rather its ‘location’/index
- The proper name for this index is an `address`
- Use the `malloc` function to request memory
- Get documentation in the UNIX shell with `man malloc`
Code example:
Request memory manually, in this case with `malloc`, asking for just 10 bytes.
Declaring `int *list` essentially indicates that we are treating the memory as an array of INTEGERS.
Note that whenever you request memory, you have to give it back (hence the `free` function)
#include <stdio.h>
#include <stdlib.h>
int main()
{
int *list = malloc(10);
fprintf(stdout, "%p\n", list); // print the address while the memory is still ours
free(list); // then give the memory back
}
Compile + run:
$ gcc -Wall -o malloc_simple malloc_simple.c
$ ./malloc_simple
> 0x600001238040
- Note that the output is hexadecimal (as opposed to binary or decimal); in this specific case, it equals 105,553,135,370,304 (around 105 trillion) 😶
- This value can be calculated with an appropriate calculator, or an online converter.
- The exact number is arbitrary and will differ between runs and machines; it just has to fit within the address space available to our program.
So for example, if you have 16 GB of memory, you have around 1.6e+10 (16 billion) bytes of physical memory. The address above is far larger than that because it is a virtual address: the operating system gives each program its own large address space and maps the parts in use onto physical memory.
Let’s actually do something with our allocated memory:
- Store the numbers 0-9 in memory
- De-reference the memory location
- We can construct a simple for-loop to fill each index of the list, and a separate for-loop to print it
💡If you're unfamiliar with a for-loop, it's one of the most fundamental programming structures, and exists in almost every language. Its operation usually follows this logic: 'for every item in this list, do <some_action>'. This action could vary from simply printing the item to applying some insane statistical model to it. The for-loop is fundamental to automating a repetitive task: when you apply an action to an entire column in Excel, under the hood it's basically performing a for-loop.
#include <stdio.h>
#include <stdlib.h>
int main()
{
int *list = malloc(10);
fprintf(stdout, "%p\n", (void *)list);
for (int i = 0; i < 10; i++)
{
list[i] = i;
}
for (int i = 0; i < 10; i++)
{
fprintf(stdout, "%d\n", list[i]);
}
free(list);
}
Note that `%p` expects a `void *`, so we cast the list with `(void *)` in the first print statement.
- A `void *` is a generic pointer type that can point to any type of memory.
- Assigning the result of `malloc` to `int *list` converts the `void *` pointer to an `int *` pointer.
- This runs on some systems, but we have cheated
- An int takes 4 bytes on most current systems (this is a legacy thing; roll with it)
- After 2 ints (8 bytes) we have almost used up our 10 bytes, so writing 10 ints goes past the end of the memory we requested!!! C does not check this for us; the behavior is undefined.
- Therefore, we need to update our `int *list` to take 10 * 4 bytes (4 being the number of bytes an int occupies)
int *list = malloc(10 * 4);
- This doesn’t mean that we should frivolously allocate memory. A single byte can, for example, hold a boolean value (`0/1`)
If we wanna be fancy, we can replace the hard-coded 4 in `10 * 4` with `sizeof(int)`
int *list = malloc(10 * sizeof(int));
fprintf(stdout, "int has %zu bytes\n", sizeof(int));
Final iteration:
#include <stdio.h>
#include <stdlib.h>
int main()
{
int *list = malloc(10 * sizeof(int));
fprintf(stdout, "int has %zu bytes\n", sizeof(int)); // %zu matches the size_t that sizeof returns
fprintf(stdout, "%p\n", (void *)list);
for (int i = 0; i < 10; i++)
{
list[i] = i;
}
for (int i = 0; i < 10; i++)
{
fprintf(stdout, "%d\n", list[i]);
}
free(list);
}
Fun fact: `fprintf` takes a file pointer as its first argument.
You can open and write to a file within the program without having to redirect `stdout`, using a syntax similar to Python's:
FILE *fp = fopen("name", "w");
You can do the same thing for 20 decimal numbers instead. An easy way is to simply divide each index by 10, which gives values from 0.0 to 1.9.
In order to do this, we have to declare `list` as a `float *`. By treating the memory as an array of FLOATS, we can perform math on it that returns valid float values.
#include <stdio.h>
#include <stdlib.h>
int main()
{
float *list = malloc(20 * sizeof(float)); // size each slot for a float, the type we store
fprintf(stdout, "%p\n", (void *)list);
for (int i = 0; i < 20; i++)
{
list[i] = (float)(i)/10;
}
for (int i = 0; i < 20; i++)
{
fprintf(stdout, "%f\n", list[i]);
}
free(list);
}
Compiling + running it gives:
$ gcc -o mal_20 malloc_20_0_1.c
$ ./mal_20
>
0x135e06800
0.000000
0.100000
0.200000
0.300000
0.400000
0.500000
0.600000
0.700000
0.800000
0.900000
1.000000
1.100000
1.200000
1.300000
1.400000
1.500000
1.600000
1.700000
1.800000
1.900000
We've worked with numbers. Let's do some simple string manipulation.
The simplest way of representing characters is with ASCII values. Let's construct a program that takes a single string as an argument and gives you the decimal (ASCII) representation of each character.
As a bonus, it will report your string, along with its length.
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
if (argc < 2){
fprintf(stderr, "Not enough parameters\n");
return 1; // stop here: without an argument, argv[1] is not a valid string
}
char *param = argv[1];
int slen = strlen(argv[1]);
fprintf(stdout, "Your string is: %s\n", argv[1]);
fprintf(stdout, "\tlength: %d\n", slen);
for (int i = 0; i < slen; i++){
fprintf(stdout, "%c - %d\n", param[i], param[i]);
}
}
Example usage + output:
# run
./str2dec HPC_Squad
# output
Your string is: HPC_Squad
length: 9
H - 72
P - 80
C - 67
_ - 95
S - 83
q - 113
u - 117
a - 97
d - 100
Explanation:
Each output decimal is the ASCII code of the character printed next to it.
These decimal values are the simplest way of encoding characters in C.
Example of an ASCII table (note: for now we are using the decimal or ‘dec’ representation):
![Untitled 3](https://private-user-images.githubusercontent.com/121201280/250603312-082a4236-63f9-481a-b6de-242bcb1a6c7d.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk0ODg4NzgsIm5iZiI6MTczOTQ4ODU3OCwicGF0aCI6Ii8xMjEyMDEyODAvMjUwNjAzMzEyLTA4MmE0MjM2LTYzZjktNDgxYS1iNmRlLTI0MmJjYjFhNmM3ZC5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjUwMjEzJTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI1MDIxM1QyMzE2MThaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT01NjNmNTY2MTY4MDI0NDcxM2Q0MTU5MjNhZjAxYmQzZGQ0ZGZiZTQ1MWZhMDQyNDBkMTIxYjk5YTgzZThlZTc4JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCJ9.5GdVvLeFTHM9WGxThE7iL2jEwiYfYzfaYTbyLClTjTU)
Source: https://commons.wikimedia.org/wiki/File:ASCII-Table-wide.svg