Introduction to Tensors (PyTorch Tutorial)

Let's talk about tensors. What is a tensor?

Typically, machine learning models are fed large amounts of data to train on. However, that data needs to be converted into numbers so that different ML algorithms can be applied to it. We convert the data into tensors, which are basically multidimensional arrays/matrices; in other words, tensors are numerical representations of the data.

import torch 

PyTorch has a dedicated documentation page on tensors. To learn more about tensors, you can visit the official page.
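
As a quick sketch of the "data to numbers" idea from above, here is how a tiny made-up dataset (two samples with three features each, the values are just for illustration) could be turned into a tensor:

raw_data = [[0.5, 1.2, 3.0],
            [2.1, 0.0, 4.7]] # made-up feature values
data_tensor = torch.tensor(raw_data) # nested list -> 2x3 tensor
data_tensor.shape
torch.Size([2, 3])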

scalar=torch.tensor(7)
scalar
tensor(7)

To create a tensor we call torch.tensor() and pass the value as a parameter. Above we have created a scalar, which is of type torch.Tensor, and with ndim we can get its number of dimensions. As a scalar holds a single value, its dimension is zero.

scalar.ndim
0

Let’s create a vector/1-d array now.

vector=torch.tensor([2,3])
vector.ndim
1
vector.size()
torch.Size([2])

Here, as we created a vector, the dimension is 1 and size() returns the number of items in the vector. Now that we have seen vectors, let's look at a matrix/2-d array.

Matrix=torch.tensor([[2,3],[5,6]])
Matrix
tensor([[2, 3],
        [5, 6]])
Matrix.shape
torch.Size([2, 2])
Matrix.ndim
2

Using shape we can see that the size of Matrix is [2, 2], because Matrix is two elements deep and two elements wide. Let's dive into tensors now.

Tensor = torch.tensor([[[1, 2, 3],
                        [7, 8, 1],
                        [2, 4, 5]]])
Tensor
tensor([[[1, 2, 3],
         [7, 8, 1],
         [2, 4, 5]]])

Wooo… We just created a Tensor. Let's check out the dimension…

Tensor.ndim
3

Let's count the number of square brackets at the start of the Tensor. Can you guess something? The dimension is 3, the same as the number of brackets we counted.

Tensor.shape
torch.Size([1, 3, 3])

What is with that shape [1, 3, 3]? Let's think… It is counted from the outermost bracket inwards. The outer bracket holds one element; that element holds three elements, [1, 2, 3], [7, 8, 1] and [2, 4, 5]; and each of those holds three numbers.
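
To double-check that intuition, here is a quick sketch (the shape is arbitrary, just for illustration) where the bracket count from outside to inside maps straight onto the shape:

check = torch.zeros(2, 3, 4) # 2 blocks, each with 3 rows of 4 numbers
check.ndim, check.shape
(3, torch.Size([2, 3, 4]))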

Creating tensors using random numbers

Using the torch.rand() function we can create a random tensor, passing the expected size as a parameter. Here, we have set size=(3,4) and got back a tensor filled with random numbers.

random=torch.rand(size=(3,4))
random
tensor([[0.2959, 0.6114, 0.2762, 0.2221],
        [0.5545, 0.3278, 0.0836, 0.9979],
        [0.3212, 0.9734, 0.7829, 0.5209]])
random.dtype
torch.float32
random_tensor=torch.rand(size=(1,3,3))
random_tensor, random_tensor.ndim
(tensor([[[0.9606, 0.2998, 0.2386],
          [0.6171, 0.6288, 0.8362],
          [0.3800, 0.7077, 0.8832]]]),
 3)

We can also create tensors filled with zeros or ones, which is handy when we need dummy tensors for different purposes.

tensor_with_ones=torch.ones((3,2))
tensor_with_ones, tensor_with_ones.dtype
(tensor([[1., 1.],
         [1., 1.],
         [1., 1.]]),
 torch.float32)
tensor_with_zeros=torch.zeros((3,4))
tensor_with_zeros, tensor_with_zeros.dtype
(tensor([[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]),
 torch.float32)
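
As a side note, if we already have a tensor and want a dummy tensor of the same shape, torch.zeros_like() (or torch.ones_like()) is one way to get it; a quick sketch:

template = torch.rand(size=(3, 2))
dummy = torch.zeros_like(template) # zeros with the same shape and dtype as template
dummy.shape
torch.Size([3, 2])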

Information about Tensors

Tensors come in different datatypes. In the official documentation you can check out the different datatypes available in PyTorch. The type used most often in machine learning is torch.float, also known as torch.float32. Tensors can hold integer, float and boolean values, and each comes in different bit widths; floats, for example, can be 8, 16, 32 or 64 bit. The higher the bit width, the more precision we get, so higher-precision tensor calculations give more accurate results with the trade-off of computation cost, while lower precision gives faster answers that are less accurate.

Also, sometimes we see tensor.cuda mentioned in different places, which means the GPU will handle the calculations for that tensor; for MacBooks the equivalent device is mps. So tensors can be device specific. Normally, a tensor is created with its device set to cpu, and to use it on the GPU we need to explicitly move it there.
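
As a small sketch of both ideas, we can set the datatype and device explicitly when creating a tensor (here the device is left as "cpu" so it runs anywhere; swap in "cuda" or "mps" if you have one available):

explicit = torch.tensor([1.0, 2.0, 3.0],
                        dtype=torch.float16, # lower precision: faster, less accurate
                        device="cpu") # or "cuda"/"mps" if available
explicit.dtype, explicit.device
(torch.float16, device(type='cpu'))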

tensor_t1=torch.arange(0,10,1.0) # we can use arange(start,stop,step) to generate a tensor
tensor_t1
tensor([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
tensor_t1.dtype, tensor_t1.device
(torch.float32, device(type='cpu'))

It seems our tensor lives on the CPU; let's change the device to "mps" as I am running this on a MacBook.

device="mps" if torch.backends.mps.is_available() else "cpu" #change "mps" into cuda if you are using cuda enabled gpu
device
'mps'
tensor_t2_gpu=tensor_t1.to(device)
tensor_t2_gpu.dtype, tensor_t2_gpu.device
(torch.float32, device(type='mps', index=0))

Now that we have moved our tensor to the GPU, how do we get it back to the CPU? Let's convert it from GPU back to CPU in the following cell.

tensor_t2_cpu=tensor_t2_gpu.cpu().numpy()
tensor_t2_cpu
array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.], dtype=float32)

Using cpu() we can move a tensor from the GPU back to the CPU. The call above returns a copy of the GPU tensor in CPU memory, so the original tensor is still on the GPU. Chaining .numpy() then gives us a NumPy array, which only works on CPU tensors, hence the .cpu() first.
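
If we ever want to go the other way, a NumPy array can be wrapped back into a tensor; a quick sketch using torch.from_numpy (note that it shares memory with the array rather than copying it):

tensor_back = torch.from_numpy(tensor_t2_cpu) # NumPy array -> CPU tensor (shares memory with the array)
tensor_back.dtype, tensor_back.device
(torch.float32, device(type='cpu'))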

Changing tensor datatype

Sometimes you may need to change a tensor's datatype for a different purpose. We can use type() to convert a tensor to another datatype, passing the desired type as a parameter.

tensor_t3=torch.arange(0,10,1.1)
tensor_t3, tensor_t3.dtype
(tensor([0.0000, 1.1000, 2.2000, 3.3000, 4.4000, 5.5000, 6.6000, 7.7000, 8.8000,
         9.9000]),
 torch.float32)
tensor_t4=tensor_t3.type(torch.int16)
tensor_t4
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=torch.int16)
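
The same pattern works in the other direction too, for example dropping down to a lower-precision float; a quick sketch:

tensor_t3_half = tensor_t3.type(torch.float16) # float32 -> float16, same values with less precision
tensor_t3_half.dtype
torch.float16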

Reshaping, squeezing and unsqueezing tensors

Sometimes we need to change the shape of a tensor without changing its elements.

tensor_t5=torch.arange(0,10,1.03)
tensor_t5
tensor([0.0000, 1.0300, 2.0600, 3.0900, 4.1200, 5.1500, 6.1800, 7.2100, 8.2400,
        9.2700])
tensor_t5.shape
torch.Size([10])
tensor_t5=tensor_t5.reshape(1,10)
tensor_t5, tensor_t5.shape
(tensor([[0.0000, 1.0300, 2.0600, 3.0900, 4.1200, 5.1500, 6.1800, 7.2100, 8.2400,
          9.2700]]),
 torch.Size([1, 10]))
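
One thing to keep in mind is that reshape() only works when the new shape holds exactly the same number of elements; a quick sketch with our 10-element tensor:

tensor_t5.reshape(2, 5).shape # works: 2 * 5 = 10 elements
torch.Size([2, 5])
# tensor_t5.reshape(3, 4) # would raise a RuntimeError, since 3 * 4 = 12 != 10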

We can remove all the dimensions of size 1 from a tensor using squeeze(); here that turns our [1, 10] tensor back into a 1-d tensor.

print(f"Tensor vlaues: {tensor_t5}")
print(f"Tensor previous shape, {tensor_t5.shape}")
print("======================================")
print(f"After squeezing Tensor values: {tensor_t5.squeeze()}")
print(f"After squeezing Tensor shape: {tensor_t5.squeeze().shape} ")
Tensor values: tensor([[0.0000, 1.0300, 2.0600, 3.0900, 4.1200, 5.1500, 6.1800, 7.2100, 8.2400,
         9.2700]])
Tensor previous shape: torch.Size([1, 10])
======================================
After squeezing Tensor values: tensor([0.0000, 1.0300, 2.0600, 3.0900, 4.1200, 5.1500, 6.1800, 7.2100, 8.2400,
        9.2700])
After squeezing Tensor shape: torch.Size([10]) 
tensor_t5=tensor_t5.squeeze()
tensor_t5
tensor([0.0000, 1.0300, 2.0600, 3.0900, 4.1200, 5.1500, 6.1800, 7.2100, 8.2400,
        9.2700])

We can add an extra dimension at a chosen position using unsqueeze().

tensor_t5.unsqueeze(dim=0)
tensor([[0.0000, 1.0300, 2.0600, 3.0900, 4.1200, 5.1500, 6.1800, 7.2100, 8.2400,
         9.2700]])
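
The dim argument controls where the new axis goes; a quick sketch with dim=1 instead, which puts each value in its own row:

tensor_t5.unsqueeze(dim=1).shape
torch.Size([10, 1])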

Here I have tried to show the most commonly used tensor methods. There are lots of things not covered here; if needed, just visit the official PyTorch documentation for more on tensors.