Symbolic visual sketching is a hallmark of human creativity, enabling the externalization of abstract concepts through figurative representations. Yet creative expression can be constrained by pervasive conceptual associations: culturally learned mappings between abstract ideas and standard visual forms (e.g., a dove symbolizing peace). Generative AI has the potential to loosen such fixations, given its access to a broad range of content and ideas, but it remains unclear whether, and how, inspiration delivered in the verbal or the visual modality better mitigates them. Here, we hypothesized that verbal inspiration induces greater conceptual divergence than visual inspiration by bypassing perceptual constraints, whereas visual inspiration may reinforce perceptually familiar visual mappings. Participants sketched abstract concepts (e.g., "time") before and after receiving GPT-4-generated verbal or visual inspiration. Drawings were analyzed with deep neural networks, comparing perceptual features (VGG16-based) and semantic-perceptual content (CLIP-based), as well as with human and GPT-4 creativity ratings. We found that verbal inspiration significantly increased semantic distance and uniqueness, whereas visual inspiration produced minimal semantic divergence from the initial sketches. Importantly, low-level perceptual features remained unchanged across conditions, indicating that verbal prompts primarily influenced the high-level conceptual framing of the sketches rather than their visual features. These findings demonstrate the effect of modality on mitigating cognitive fixations, with verbal inspiration promoting more unconventional visual sketching.
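The semantic-distance comparison described above can be sketched minimally as a cosine distance between embedding vectors; the vectors below are hypothetical placeholders standing in for real CLIP image embeddings of a pre- and post-inspiration sketch, since the abstract does not specify the exact distance metric used.

```python
import numpy as np

def semantic_distance(emb_a: np.ndarray, emb_b: np.ndarray) -> float:
    """Cosine distance (1 - cosine similarity) between two embedding vectors."""
    a = emb_a / np.linalg.norm(emb_a)
    b = emb_b / np.linalg.norm(emb_b)
    return float(1.0 - np.dot(a, b))

# Hypothetical 4-d embeddings standing in for CLIP encodings of the same
# concept sketched before and after receiving inspiration.
pre = np.array([0.2, 0.9, 0.1, 0.4])
post = np.array([0.7, 0.3, 0.8, 0.2])

print(semantic_distance(pre, post))
```

A larger distance between the pre- and post-inspiration embeddings would indicate greater conceptual divergence; identical sketches yield a distance of zero.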