Abstract
We present a model for representing reports extracted from a radiological collaborative social network, combining textual and visual descriptors. The text and the medical image that make up a report are each described by a vector of TF-IDF weights, following a "bag-of-words" approach. The model supports multimodal queries for searching medical information. We evaluate it on the ImageCLEFmed 2015 dataset, for which ground truth is available. We conducted many experiments with various descriptors and combinations of modalities. Analysis of the results shows that the model based on two modalities outperforms a search system based on a single modality, whether textual or visual.
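
To make the representation concrete, the following is a minimal sketch of how a report could be indexed and queried under the description above. It is illustrative only and not the implementation evaluated in this paper. Several points are assumptions: images are taken to be already quantized into "visual word" identifiers (the `vw…` tokens are hypothetical), the TF-IDF variant is one common smoothed form, and the two modalities are combined by a weighted sum of cosine similarities controlled by a hypothetical parameter `alpha`. All function names are invented for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) from a list of token lists."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Assumed IDF variant: log(N / df) + 1, so terms shared by all docs keep some weight.
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in tf.items()})
    return vectors, idf


def query_vector(tokens, idf):
    """Weight a query with the collection's IDF, ignoring unseen terms."""
    tf = Counter(t for t in tokens if t in idf)
    total = sum(tf.values())
    if total == 0:
        return {}
    return {t: (c / total) * idf[t] for t, c in tf.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def multimodal_search(text_query, visual_query, text_docs, visual_docs, alpha=0.5):
    """Rank reports by a weighted sum of text and visual similarity.

    The linear fusion and `alpha` are assumptions made for this sketch.
    """
    text_vecs, text_idf = tfidf_vectors(text_docs)
    vis_vecs, vis_idf = tfidf_vectors(visual_docs)
    tq = query_vector(text_query, text_idf)
    vq = query_vector(visual_query, vis_idf)
    scores = [
        alpha * cosine(tq, tv) + (1.0 - alpha) * cosine(vq, vv)
        for tv, vv in zip(text_vecs, vis_vecs)
    ]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy collection: report i has text tokens text_docs[i] and visual words visual_docs[i].
text_docs = [
    ["chest", "xray", "nodule"],
    ["brain", "mri", "lesion"],
    ["chest", "ct", "nodule"],
]
visual_docs = [
    ["vw3", "vw7", "vw7"],
    ["vw1", "vw2"],
    ["vw3", "vw9"],
]
ranking = multimodal_search(["brain", "mri"], ["vw1"], text_docs, visual_docs)
```

In this toy collection, only the second report shares terms with the query in either modality, so it is ranked first. Setting `alpha` to 1.0 or 0.0 reduces the sketch to a text-only or image-only search, which is the kind of single-modality baseline the abstract compares against.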