I would love some help with my parallel programming class, specifically with understanding how to do my PA3 assignment (CUDA and transpose matrix multiplication) and future assignments, so that I can pass my final. I have two assignments that I need help with at the moment: PA2 (vectorization), which is due tonight, and PA3, which is due on 11/25. My main focus will be PA3. I have other files for these assignments and will send them when the time is right. I feel like I lack the fundamentals, but I am a fast learner. I also have recordings of the lectures, but I do not understand what is going on in them. Please help! :)