With the development of generative models, the abuse of Deepfakes has raised public concern. As a defense mechanism, face forgery detection methods have been intensively studied. Remote photoplethysmography (rPPG) extracts the heartbeat signal from recorded videos by examining the subtle changes in skin color caused by cardiac activity. Since the face forgery process inevitably disrupts the periodic changes in facial color, the rPPG signal proves to be a powerful biological indicator for Deepfake detection. Motivated by the key observation that rPPG signals exhibit unique rhythmic patterns for different manipulation methods, we also treat Deepfake detection as a source detection task. The Multi-scale Spatial–Temporal PPG map is adopted to further exploit heartbeat signals from multiple facial regions. Moreover, to capture both spatial and temporal inconsistencies, we propose a two-stage network consisting of a Mask-Guided Local Attention (MLA) module, which captures unique local patterns of PPG maps, and a Temporal Transformer, which models long-range interactions among the features of adjacent PPG maps. Extensive experiments on the FaceForensics++ and Celeb-DF datasets demonstrate the superiority of our method over all other rPPG-based approaches. Visualizations further illustrate the effectiveness of the proposed method.