A long time ago I was interested in finding out what the intersection of all languages' sounds would be. Would it be an empty set, or was there enough overlap that someone designing the next Esperanto could build it from universal sounds, allowing any human speaker to pronounce it without difficulty? It turns out there is a good degree of overlap if you allow for a small bit of fuzziness. For instance, 'a', 'm', and 'p' are some of the most common sounds across languages, which is also interesting when you realize that "mama" and "papa", with slight variations, are pronounced in highly similar ways across a great many languages.
I think if you take the complete intersection, the set is empty. However, the five standard vowels /a/, /e/, /i/, /o/, /u/ are fairly common (although many Native American languages have four vowels). For consonants, /p/, /t/, /k/, /m/, /n/, /s/, and /l/ are probably the most common.
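To make the "strict intersection vs. a bit of fuzziness" distinction concrete, here is a minimal Python sketch. The four inventories are made-up toy data (the names `lang_a` through `lang_d` are hypothetical, not real languages), and the 75% threshold is just an assumption for illustration; real numbers would have to come from an actual phoneme database.

```python
from collections import Counter

# Hypothetical toy inventories for illustration only; NOT real language data.
inventories = {
    "lang_a": {"a", "e", "i", "o", "u", "p", "t", "k", "m", "n", "s", "l"},
    "lang_b": {"a", "i", "u", "p", "t", "k", "m", "n", "s"},
    "lang_c": {"a", "e", "i", "o", "b", "t", "k", "m", "n", "l"},
    "lang_d": {"a", "i", "u", "e", "p", "t", "m", "n", "s", "l"},
}


def strict_intersection(invs):
    """Sounds present in every single inventory."""
    sets = list(invs.values())
    return set.intersection(*sets) if sets else set()


def fuzzy_intersection(invs, threshold):
    """Sounds present in at least `threshold` (0..1) of the inventories."""
    counts = Counter(sound for inv in invs.values() for sound in inv)
    needed = threshold * len(invs)
    return {sound for sound, count in counts.items() if count >= needed}


if __name__ == "__main__":
    print(sorted(strict_intersection(inventories)))
    print(sorted(fuzzy_intersection(inventories, 0.75)))
```

With more inventories the strict set can only shrink, since every added language can remove sounds but never add any, while the thresholded set stays comparatively stable. That is roughly why the strict intersection over all languages is probably empty even though a core of very common sounds exists.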
Esperanto's phonotactics are basically those of Zamenhof's Slavic background (Polish and Belarusian). Lojban, however, aims to be more neutral, even to the point of permitting epenthetic vowels in complex consonant clusters, stressing that they need to be distinguishable from the 6 regular vowels (the 5 vowels above plus /ə/).
It should be noted that when languages have restrictive phonemic inventories, the challenge is not so much in producing the other sounds as it is in recognizing them as distinct sounds.
If you're not already familiar with Toki Pona (https://en.wikipedia.org/wiki/Toki_Pona), check it out. "Both its sound inventory and phonotactics (patterns of possible sound combinations) are found in the majority of human languages and are therefore readily accessible."
Many Middle Eastern languages/dialects don't have the 'p' sound. For example, a native Egyptian with no training in English pronunciation would pronounce 'people' as 'biibul'.
For more info, see: https://www.quora.com/What-are-the-most-common-phonemes-amon...
https://en.wikipedia.org/wiki/Mama_and_papa