Activity recognition is the ability to identify and recognize the action or goals of the agent. The agent can be any object or entity that performs action that has end goals. The agents can be a single agent performing the action or group of agents performing the actions or having some interaction. Human activity recognition has gained popularity due to its demands in many practical applications such as entertainment, healthcare, simulations and surveillance systems. Vision based activity recognition is gaining advantage as it does not require any human intervention or physical contact with humans. Moreover, there are set of cameras that are networked with the intention to track and recognize the activities of the agent. Traditional applications that were required to track or recognize human activities made use of wearable devices. However, such applications require physical contact of the person. To overcome such challenges, vision based activity recognition system can be used, which uses a camera to record the video and a processor that performs the task of recognition. The work is implemented in two stages. In the first stage, an approach for the Implementation of Activity recognition is proposed using background subtraction of images, followed by 3D- Convolutional Neural Networks. The impact of using Background subtraction prior to 3D-Convolutional Neural Networks has been reported. In the second stage, the work is further extended and implemented on Raspberry Pi, that can be used to record a stream of video, followed by recognizing the activity that was involved in the video. Thus, a proof-of-concept for activity recognition using small, IoT based device, is provided, which can enhance the system and extend its applications in various forms like, increase in portability, networking, and other capabilities of the device.